Scene text spotting based on end-to-end

Authors
Wei G. [1, 2]
Rong W. [1]
Liang Y. [1]
Xiao X. [1]
Liu X. [1]
Affiliations
[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Shandong, Qingdao
[2] College of Intelligent Equipment, Shandong University of Science and Technology, Shandong, Taian
Keywords
End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM
DOI
10.3233/JIFS-200903
Chinese Library Classification (CLC) number
TN911 [Communication theory]
Discipline classification code
081002
Abstract
Traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task. To address this problem, this paper proposes a novel end-to-end text spotting framework. The framework has three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because both networks share the convolutional feature network, text detection and text recognition can be jointly optimized. This reduces the computational burden and makes effective use of the inherent connection between detection and recognition. The model adds a TCM (Text Context Module) to Mask R-CNN, which can effectively address the negative sample problem in text detection. The paper also proposes a text recognition model based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The model significantly surpasses state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.
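The joint optimization described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: scalar "features" stand in for convolutional feature maps, every function name is hypothetical, and the weighted-sum loss with weight `lam` is an assumption, since the abstract does not state how the two task losses are combined. The only point shown is that the shared features are computed once and that a single update step serves both the detection and the recognition objective.

```python
# Toy illustration only. All names and the loss weighting are assumptions,
# not details taken from the paper.

def shared_features(image, w_shared):
    """Hypothetical shared backbone: one scalar weight applied to every input value."""
    return [w_shared * px for px in image]

def detection_loss(features, is_text):
    """Hypothetical detector head: squared error of the mean feature vs. a text label."""
    score = sum(features) / len(features)
    return (score - is_text) ** 2

def recognition_loss(features, target):
    """Hypothetical recognizer head: mean squared error against a per-position target."""
    return sum((f - t) ** 2 for f, t in zip(features, target)) / len(features)

def joint_loss(image, w_shared, is_text, target, lam=1.0):
    """Assumed joint objective: detection loss plus lam times recognition loss."""
    feats = shared_features(image, w_shared)  # computed once, reused by both heads
    return detection_loss(feats, is_text) + lam * recognition_loss(feats, target)

def numeric_grad(f, w, eps=1e-6):
    """Central-difference estimate of df/dw for a scalar function f."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)

def train(image, is_text, target, w=0.0, lr=0.1, steps=200, lam=1.0):
    """Gradient descent on the shared weight using the joint loss."""
    for _ in range(steps):
        grad = numeric_grad(lambda v: joint_loss(image, v, is_text, target, lam), w)
        w -= lr * grad
    return w

if __name__ == "__main__":
    image = [1.0, 0.5, 1.5]
    w_final = train(image, is_text=1.0, target=image)
    print(w_final, joint_loss(image, w_final, 1.0, image))
```

In this toy setup both heads are satisfied by the same shared weight, so the joint loss falls toward zero. In a real spotting network the two losses would be back-propagated through a convolutional backbone by an autograd framework rather than by numeric differentiation.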
Pages: 8871-8881
Number of pages: 10