Toward Arbitrary-Shaped Text Spotting Based on End-to-End

Cited by: 4
Authors
Wei, Guangcun [1, 2]
Rong, Wansheng [1]
Liang, Yongquan [1]
Xiao, Xinguang [1]
Liu, Xiang [1]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao 266590, Peoples R China
[2] Shandong Univ Sci & Technol, Coll Intelligent Equipment, Tai An 271019, Shandong, Peoples R China
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Keywords
Text recognition; Feature extraction; Task analysis; Detectors; Optimization; Convolution; Optical character recognition software; Natural scene text spotting; SA-BiLSTM; end-to-end; joint optimization; SCENE TEXT; RECOGNITION
DOI
10.1109/ACCESS.2020.3020387
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Text spotting in natural scenes has become an active research topic, and curved text and long text remain its main difficulties. To address both, we propose a novel end-to-end text spotting model with three parts: a shared convolution module, a text detector module, and a text recognizer module. For long text, we adopt a corner attention mechanism to extract features more effectively. For curved text, we feed the rectified feature map into an SA-BiLSTM decoder to recognize the text more effectively. More importantly, a joint optimization strategy lets the text detection and text recognition tasks reinforce each other. Experimental results on the TotalText, ICDAR2015, ICDAR2013, CTW1500, COCO-Text, and MLT datasets show that our method achieves excellent performance and robustness in end-to-end text spotting in natural scenes.
Pages: 159906 - 159914
Number of pages: 9