Scene text spotting based on end-to-end

Cited by: 0
Authors
Wei G. [1, 2]
Rong W. [1]
Liang Y. [1]
Xiao X. [1]
Liu X. [1]
Affiliations
[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Shandong, Qingdao
[2] College of Intelligent Equipment, Shandong University of Science and Technology, Shandong, Taian
Keywords
End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM
DOI
10.3233/JIFS-200903
Chinese Library Classification (CLC) number
TN911 [Communication theory]
Discipline classification code
081002
Abstract
To address the problem that traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task, this paper proposes a novel end-to-end text spotting framework. The framework consists of three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because the convolutional feature network is shared, the text detection network and the text recognition network can be jointly optimized. This reduces the computational burden and makes effective use of the inherent connection between text detection and text recognition. The model adds a TCM (Text Context Module) to Mask R-CNN, which effectively addresses the negative sample problem in text detection. For recognition, the paper proposes a model based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The authors report that the model significantly surpasses state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.
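The joint optimization described in the abstract can be illustrated with a toy sketch. The abstract does not give the loss function or any layer details, so everything below is an assumption made for illustration: the function names, the stand-in "backbone" and "heads", and the weighted-sum form of the joint loss are hypothetical and are not the authors' implementation. The only point shown is that features are computed once and both task losses depend on them.

```python
from typing import List, Tuple

Features = List[float]


def shared_backbone(image: List[float]) -> Features:
    """Hypothetical stand-in for the shared convolutional feature network.

    Produces a toy "feature map" by averaging neighbouring values.
    """
    return [(a + b) / 2.0 for a, b in zip(image, image[1:])]


def detection_head(features: Features) -> float:
    """Hypothetical stand-in for the text detector: mean of the features."""
    return sum(features) / len(features)


def recognition_head(features: Features) -> float:
    """Hypothetical stand-in for the text recognizer: max of the features."""
    return max(features)


def squared_error(prediction: float, target: float) -> float:
    return (prediction - target) ** 2


def joint_loss(
    image: List[float],
    det_target: float,
    rec_target: float,
    rec_weight: float = 1.0,  # assumed balancing weight, not from the paper
) -> Tuple[float, float, float]:
    """Return (total, detection, recognition) losses for one sample."""
    # The shared features are computed once and reused by both heads,
    # which is where the saving in computation comes from.
    features = shared_backbone(image)
    det_loss = squared_error(detection_head(features), det_target)
    rec_loss = squared_error(recognition_head(features), rec_target)
    # Assumed form: a weighted sum, so one optimization step on the total
    # would update the shared features using both tasks.
    total = det_loss + rec_weight * rec_loss
    return total, det_loss, rec_loss


if __name__ == "__main__":
    total, det, rec = joint_loss([0.0, 2.0, 4.0], 1.0, 1.0, rec_weight=0.5)
    print(total, det, rec)
```

In a real system the two heads would be neural networks and the total would be minimized by gradient descent; the weighted sum here is only one common way to combine task losses.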
Pages: 8871 - 8881
Number of pages: 10
Related papers
50 in total
  • [1] Toward Arbitrary-Shaped Text Spotting Based on End-to-End
    Wei, Guangcun
    Rong, Wansheng
    Liang, Yongquan
    Xiao, Xinguang
    Liu, Xiang
    IEEE ACCESS, 2020, 8 (08) : 159906 - 159914
  • [2] Transformer-based end-to-end scene text recognition
    Zhu, Xinghao
    Zhang, Zhi
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021 : 1691 - 1695
  • [3] Character Can Speak Directly: An End-to-End Character Region Excavation Network for Scene Text Spotting
    Li, Yan
    Shu, Yan
    Li, Binyang
    Xu, Ruifeng
    ELECTRONICS, 2025, 14 (05)
  • [4] EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition
    Hao, Jiedong
    Wen, Yafei
    Deng, Jie
    Gan, Jun
    Ren, Shuai
    Tan, Hui
    Chen, Xiaoxin
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 95 - 108
  • [5] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
    Liao, Minghui
    Lyu, Pengyuan
    He, Minghang
    Yao, Cong
    Wu, Wenhao
    Bai, Xiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (02) : 532 - 548
  • [6] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
    Lyu, Pengyuan
    Liao, Minghui
    Yao, Cong
    Wu, Wenhao
    Bai, Xiang
    COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 71 - 88
  • [7] End-to-end scene text recognition using tree-structured models
    Shi, Cunzhao
    Wang, Chunheng
    Xiao, Baihua
    Gao, Song
    Hu, Jinlong
    PATTERN RECOGNITION, 2014, 47 (09) : 2853 - 2866
  • [8] Visual place recognition from end-to-end semantic scene text features
    Raisi, Zobeir
    Zelek, John
    FRONTIERS IN ROBOTICS AND AI, 2024, 11
  • [9] End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin
    Bai, Ye
    Yi, Jiangyan
    Ni, Hao
    Wen, Zhengqi
    Liu, Bin
    Li, Ya
    Tao, Jianhua
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016
  • [10] End-to-End Multi-Look Keyword Spotting
    Yu, Meng
    Ji, Xuan
    Wu, Bo
    Su, Dan
    Yu, Dong
    INTERSPEECH 2020, 2020 : 66 - 70