Scene text spotting based on end-to-end

Cited: 0
|
Authors
Wei G. [1,2]
Rong W. [1 ]
Liang Y. [1 ]
Xiao X. [1 ]
Liu X. [1 ]
Affiliations
[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong
[2] College of Intelligent Equipment, Shandong University of Science and Technology, Taian, Shandong
Keywords
End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM
DOI
10.3233/JIFS-200903
CLC Number
TN911 [Communication Theory]
Subject Classification Code
081002
Abstract
Aiming at the problem that traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task, this paper proposes a novel end-to-end text spotting framework. The framework comprises three parts: a shared convolutional feature network, a text detector, and a text recognizer. By sharing the convolutional feature network, the text detection network and the text recognition network can be jointly optimized at the same time; this both reduces the computational burden and effectively exploits the inherent connection between text detection and text recognition. The model adds a TCM (Text Context Module) on top of Mask R-CNN, which effectively mitigates the negative-sample problem in text detection tasks. The paper also proposes a text recognition model based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The model significantly surpasses state-of-the-art methods on a number of text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.
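The abstract describes a three-part architecture: a shared convolutional feature network whose output feeds both a detection branch (Mask R-CNN plus TCM) and a SAM-BiLSTM recognition branch. Below is a minimal PyTorch-style sketch of that structure; every class name, tensor shape, and the simplified attention and RoI handling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Toy spatial attention: softmax-normalized weights over feature positions."""

    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W)
        scores = self.score(feats)                        # (B, 1, H, W)
        weights = torch.softmax(scores.flatten(2), dim=-1).view_as(scores)
        return feats * weights                            # attention-weighted features


class Recognizer(nn.Module):
    """SAM-BiLSTM-style head: spatial attention over RoI features, then a BiLSTM."""

    def __init__(self, channels: int, hidden: int, num_classes: int):
        super().__init__()
        self.sam = SpatialAttention(channels)
        self.rnn = nn.LSTM(channels, hidden, bidirectional=True, batch_first=True)
        self.cls = nn.Linear(2 * hidden, num_classes)

    def forward(self, roi_feats: torch.Tensor) -> torch.Tensor:
        # roi_feats: (B, C, H, W), one cropped text region per batch element
        x = self.sam(roi_feats).mean(dim=2)               # collapse height: (B, C, W)
        x = x.permute(0, 2, 1)                            # sequence along width: (B, W, C)
        out, _ = self.rnn(x)                              # (B, W, 2*hidden)
        return self.cls(out)                              # per-step character logits


class TextSpotter(nn.Module):
    """Shared backbone feeding both a detection head and a recognition head."""

    def __init__(self, backbone: nn.Module, detector: nn.Module, recognizer: nn.Module):
        super().__init__()
        self.backbone = backbone      # shared convolutional feature network
        self.detector = detector      # e.g. a Mask R-CNN-style head (plus TCM)
        self.recognizer = recognizer  # SAM-BiLSTM-style recognition head

    def forward(self, images, roi_feats):
        feats = self.backbone(images)        # computed once, shared by both tasks
        detections = self.detector(feats)    # text-region proposals/masks
        chars = self.recognizer(roi_feats)   # roi_feats: regions cropped from `feats`
        return detections, chars             # a joint loss optimizes both heads
```

Because both heads read the same backbone output, a single backward pass over the combined detection and recognition losses realizes the joint optimization the abstract describes.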
Pages: 8871-8881
Page count: 10
Related Papers
50 items in total
  • [31] End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention
    Wei, Bo
    Yang, Meirong
    Zhang, Tao
    Tang, Xiao
    Huang, Xing
    Kim, Kyuhong
    Lee, Jaeyun
    Cho, Kiho
    Park, Sung-Un
    INTERSPEECH 2021, 2021, : 361 - 365
  • [32] End-to-end text-to-speech synthesis with unaligned multiple language units based on attention
    Aso, Masashi
    Takamichi, Shinnosuke
    Saruwatari, Hiroshi
    INTERSPEECH 2020, 2020, : 4009 - 4013
  • [33] Myanmar Text-to-Speech Synthesis Using End-to-End Model
    Qin, Qinglai
    Yang, Jian
    Li, Peiying
    2020 4TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2020, 2020, : 6 - 11
  • [34] TRIE: End-to-End Text Reading and Information Extraction for Document Understanding
    Zhang, Peng
    Xu, Yunlu
    Cheng, Zhanzhan
    Pu, Shiliang
    Lu, Jing
    Qiao, Liang
    Niu, Yi
    Wu, Fei
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1413 - 1422
  • [35] End-to-end hard constrained text generation via incrementally predicting segments
    Nie, Jinran
    Huang, Xuancheng
    Liu, Yang
    Kong, Cunliang
    Liu, Xin
    Yang, Liner
    Yang, Erhong
    KNOWLEDGE-BASED SYSTEMS, 2023, 278
  • [36] Re-weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting
    Zhang, Kun
    Wu, Zhiyong
    Yuan, Daode
    Luan, Jian
    Jia, Jia
    Meng, Helen
    Song, Binheng
    INTERSPEECH 2020, 2020, : 2567 - 2571
  • [37] TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network
    Sun, Yipeng
    Zhang, Chengquan
    Huang, Zuming
    Liu, Jiaming
    Han, Junyu
    Ding, Errui
    COMPUTER VISION - ACCV 2018, PT III, 2019, 11363 : 83 - 99
  • [38] On the Training and Testing Data Preparation for End-to-End Text-to-Speech Application
    Duc Chung Tran
    Khan, M. K. A. Ahamed
    Sridevi, S.
    2020 11TH IEEE CONTROL AND SYSTEM GRADUATE RESEARCH COLLOQUIUM (ICSGRC), 2020, : 73 - 75
  • [39] An End-to-End Depression Recognition Method Based on EEGNet
    Liu, Bo
    Chang, Hongli
    Peng, Kang
    Wang, Xuenan
    FRONTIERS IN PSYCHIATRY, 2022, 13
  • [40] RefineNet-based End-to-end Speech Enhancement
    Lan, T.
    Peng, C.
    Li, S.
    Qian, Y.-X.
    Chen, C.
    Liu, Q.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (02): : 554 - 563