Scene text spotting based on end-to-end

Authors
Wei G. [1, 2]
Rong W. [1]
Liang Y. [1]
Xiao X. [1]
Liu X. [1]
Affiliations
[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Shandong, Qingdao
[2] College of Intelligent Equipment, Shandong University of Science and Technology, Shandong, Taian
Keywords
End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM
DOI
10.3233/JIFS-200903
Chinese Library Classification number
TN911 [Communication theory]
Discipline classification code
081002
Abstract
Traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task. To address this, the paper proposes a novel end-to-end text spotting framework with three parts: a shared convolutional feature network, a text detector, and a text recognizer. Sharing the convolutional feature network lets the detection and recognition networks be optimized jointly, which reduces the computational burden and makes effective use of the inherent connection between the two tasks. The detector adds a TCM (Text Context Module) to Mask R-CNN, which can effectively address the negative-sample problem in text detection. The recognizer is based on SAM-BiLSTM (a spatial attention mechanism combined with a BiLSTM), which extracts the semantic information between characters more effectively. The model is reported to significantly surpass state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.
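The joint optimization described in the abstract can be illustrated with a deliberately tiny sketch. It is not the paper's model: three scalar weights stand in for the shared convolutional feature network, the text detector, and the text recognizer, and the names `ToySpotter`, `joint_step`, and the loss weight `lam` are hypothetical. The only point it shows is that the shared parameter is computed once per input and receives gradient from both task losses.

```python
from dataclasses import dataclass


@dataclass
class ToySpotter:
    # Each scalar is a stand-in (assumption) for a whole sub-network.
    w_shared: float  # shared convolutional feature network
    w_det: float     # text detector head
    w_rec: float     # text recognizer head

    def forward(self, x):
        feat = self.w_shared * x  # computed once, reused by both heads
        return feat, self.w_det * feat, self.w_rec * feat


def joint_step(model, x, y_det, y_rec, lam=1.0, lr=0.1):
    """One gradient step on loss = det_loss + lam * rec_loss (squared errors)."""
    feat, det_out, rec_out = model.forward(x)
    det_err = det_out - y_det
    rec_err = rec_out - y_rec
    loss = det_err ** 2 + lam * rec_err ** 2

    # Analytic gradients for the two heads.
    g_det = 2 * det_err * feat
    g_rec = lam * 2 * rec_err * feat

    # The shared weight gets one contribution from each task.
    g_shared_from_det = 2 * det_err * model.w_det * x
    g_shared_from_rec = lam * 2 * rec_err * model.w_rec * x

    # All gradients are computed before any weight is updated.
    model.w_det -= lr * g_det
    model.w_rec -= lr * g_rec
    model.w_shared -= lr * (g_shared_from_det + g_shared_from_rec)
    return loss, g_shared_from_det, g_shared_from_rec


if __name__ == "__main__":
    model = ToySpotter(w_shared=1.0, w_det=1.0, w_rec=1.0)
    for step in range(3):
        loss, from_det, from_rec = joint_step(model, x=1.0, y_det=2.0, y_rec=3.0)
        print(step, loss, from_det, from_rec)
```

In a two-stage pipeline the detector and recognizer would each have their own feature extractor and be trained separately; here a single `w_shared` serves both heads, which is the sharing the abstract credits with lower computation and with coupling the two tasks.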
Pages: 8871-8881
Number of pages: 10