Scene text spotting based on end-to-end

Cited by: 0

Authors:

Wei G. ^{[1,2]}

Rong W. ^{[1]}

Liang Y. ^{[1]}

Xiao X. ^{[1]}

Liu X. ^{[1]}

Affiliations:

[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Shandong, Qingdao

[2] College of Intelligent Equipment, Shandong University of Science and Technology, Shandong, Taian

Source:

Journal of Intelligent and Fuzzy Systems | 2021 / Vol. 40 / Issue 05

Keywords:

End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM;

DOI:

10.3233/JIFS-200903

Chinese Library Classification (CLC) number:

TN911 [Communication theory];

Discipline classification code:

081002 ;

Abstract:

To address the problem that traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task, this paper proposes a novel end-to-end text spotting framework. The framework consists of three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because the text detection network and the text recognition network share the convolutional feature network, they can be jointly optimized. This sharing reduces the computational burden and makes effective use of the inherent connection between text detection and text recognition. The model adds a TCM (Text Context Module) to Mask R-CNN, which effectively addresses the negative sample problem in text detection tasks. The paper also proposes a text recognition model based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The model significantly surpasses state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.

Pages: 8871 - 8881

Page count: 10

50 items in total

[1] Toward Arbitrary-Shaped Text Spotting Based on End-to-End
Wei, Guangcun
Rong, Wansheng
Liang, Yongquan
Xiao, Xinguang
Liu, Xiang
IEEE ACCESS, 2020, 8 (08) : 159906 - 159914
[2] Transformer-based end-to-end scene text recognition
Zhu, Xinghao
Zhang, Zhi
PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021 : 1691 - 1695
[3] Character Can Speak Directly: An End-to-End Character Region Excavation Network for Scene Text Spotting
Li, Yan
Shu, Yan
Li, Binyang
Xu, Ruifeng
ELECTRONICS, 2025, 14 (05)
[4] EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition
Hao, Jiedong
Wen, Yafei
Deng, Jie
Gan, Jun
Ren, Shuai
Tan, Hui
Chen, Xiaoxin
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 95 - 108
[5] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Liao, Minghui
Lyu, Pengyuan
He, Minghang
Yao, Cong
Wu, Wenhao
Bai, Xiang
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (02) : 532 - 548
[6] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Lyu, Pengyuan
Liao, Minghui
Yao, Cong
Wu, Wenhao
Bai, Xiang
COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 71 - 88
[7] End-to-end scene text recognition using tree-structured models
Shi, Cunzhao
Wang, Chunheng
Xiao, Baihua
Gao, Song
Hu, Jinlong
PATTERN RECOGNITION, 2014, 47 (09) : 2853 - 2866
[8] Visual place recognition from end-to-end semantic scene text features
Raisi, Zobeir
Zelek, John
FRONTIERS IN ROBOTICS AND AI, 2024, 11
[9] End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin
Bai, Ye
Yi, Jiangyan
Ni, Hao
Wen, Zhengqi
Liu, Bin
Li, Ya
Tao, Jianhua
2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016
[10] End-to-End Multi-Look Keyword Spotting
Yu, Meng
Ji, Xuan
Wu, Bo
Su, Dan
Yu, Dong
INTERSPEECH 2020, 2020 : 66 - 70