Scene text spotting based on end-to-end

Cited by: 0

Authors:

Wei G. ^{[1,2]}

Rong W. ^{[1]}

Liang Y. ^{[1]}

Xiao X. ^{[1]}

Liu X. ^{[1]}

Affiliations:

[1] College of Computer Science and Engineering, Shandong University of Science and Technology, Shandong, Qingdao

[2] College of Intelligent Equipment, Shandong University of Science and Technology, Shandong, Taian

Source:

Journal of Intelligent and Fuzzy Systems | 2021 / Vol. 40 / No. 05

Keywords:

End-to-end; Joint optimization; SAM-BiLSTM; Scene text spotting; TCM;

DOI:

10.3233/JIFS-200903

Chinese Library Classification (CLC) number:

TN911 [Communication theory];

Discipline classification code:

081002;

Abstract:

Traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task. To address this problem, this paper proposes a novel end-to-end text spotting framework. The framework comprises three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because the two networks share the convolutional feature network, text detection and text recognition can be jointly optimized. This sharing reduces the computational burden and lets the model exploit the inherent connection between detection and recognition. For detection, the model adds a TCM (Text Context Module) to Mask R-CNN, which the authors state can effectively address the negative sample problem in text detection tasks. For recognition, the paper proposes a model based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which can extract the semantic information between characters more effectively. The authors report that the model significantly surpasses state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text. © 2021 - IOS Press. All rights reserved.
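
The joint optimization described in the abstract can be illustrated with a deliberately tiny sketch. It is not the paper's architecture: the scalar "backbone" weight `w`, the head weights `a` and `b`, the squared-error losses, the loss weight `lam`, and the learning rate are all hypothetical stand-ins chosen only to show that a shared feature is computed once and that the shared parameter receives gradient from both the detection loss and the recognition loss.

```python
# Toy illustration of joint optimization over a shared feature extractor.
# All names and values are hypothetical; this is not the model from the paper.

def joint_step(w, a, b, x, y_det, y_rec, lam=1.0, lr=0.01):
    """One gradient-descent step on L = L_det + lam * L_rec.

    w: shared "backbone" weight
    a: detection-head weight
    b: recognition-head weight
    Returns the updated (w, a, b) and the loss before the update.
    """
    f = w * x                      # shared feature, computed once
    e_det = a * f - y_det          # detection-head residual
    e_rec = b * f - y_rec          # recognition-head residual
    loss = e_det ** 2 + lam * e_rec ** 2

    # Chain rule: each head only sees its own loss term,
    # while the shared weight accumulates both terms.
    g_a = 2 * e_det * f
    g_b = 2 * lam * e_rec * f
    g_w = 2 * e_det * a * x + 2 * lam * e_rec * b * x

    return w - lr * g_w, a - lr * g_a, b - lr * g_b, loss


def train(steps=200):
    """Run several joint steps on one toy sample and record the loss."""
    w, a, b = 0.5, 0.5, 0.5
    losses = []
    for _ in range(steps):
        w, a, b, loss = joint_step(w, a, b, x=1.0, y_det=1.0, y_rec=2.0)
        losses.append(loss)
    return losses
```

Calling `train()` returns the per-step joint loss, which should shrink as both heads and the shared weight are updated together. In a real system the scalar weights would be replaced by a convolutional backbone and task-specific networks, but the structure of the update (one forward pass through shared features, one summed loss) stays the same.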

Citation

Pages: 8871-8881

Page count: 10
