Unconstrained end-to-end text reading with feature rectification

Cited by: 2

Authors:

Du, Chen ^{[1,2]}

Wang, Yanna ^{[1]}

Wang, Chunheng ^{[1]}

Xiao, Baihua ^{[1]}

Shi, Cunzhao ^{[1]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2021 / Vol. 149

Funding:

National Natural Science Foundation of China;

Keywords:

Text recognition; Text detection; Position-sensitive network; Features incompatibility; End-to-end; NETWORK;

DOI:

10.1016/j.patrec.2021.05.017

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose an end-to-end trainable network that simultaneously localizes and recognizes irregular text in images. Specifically, we identify a feature incompatibility problem, which arises because detection and recognition place conflicting demands on the features extracted by the convolutional neural network, and we propose feeding larger-scale features to the recognition part, rather than reusing the detection features, to improve recognition accuracy. To extract effective text features for perspective and curved text recognition, we propose a position-sensitive network that rectifies the text proposal features in the recognition branch. The position-sensitive network, which is trained in a weakly supervised manner, takes the proposal detection feature as input and outputs the feature rectification information. Experiments demonstrate that the proposed method achieves state-of-the-art or highly competitive performance compared with baselines on a number of benchmarks. (c) 2021 Elsevier B.V. All rights reserved.

Pages: 1 / 8

Page count: 8

References (46 in total; entries 1 to 10 listed):

[1] Baek, Youngmin; Lee, Bado; Han, Dongyoon; Yun, Sangdoo; Lee, Hwalsuk. Character Region Awareness for Text Detection [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 9357-9366
[2] Busta, Michal; Neumann, Lukas; Matas, Jiri. Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2223-2231
[3] Ch'ng, Chee Kheng; Chan, Chee Seng. Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017: 935-942
[4] Chen Du, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR). Proceedings, P375, DOI 10.1109/ICDAR.2019.00067
[5] Cheng, Zhanzhan; Bai, Fan; Xu, Yunlu; Zheng, Gang; Pu, Shiliang; Zhou, Shuigeng. Focusing Attention: Towards Accurate Text Recognition in Natural Images [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5086-5094
[6] Dai J, 2016, PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), P1796, DOI 10.1109/ICIT.2016.7475036
[7] Dai YC, 2018, INT C PATT RECOG, P3604, DOI 10.1109/ICPR.2018.8546066
[8] Deng D, 2018, AAAI CONF ARTIF INTE, P6773
[9] Du, Chen; Wang, Chunheng; Wang, Yanna; Shi, Cunzhao; Xiao, Baihua. Selective feature connection mechanism: Concatenating multi-layer CNN features with a feature selector [J]. PATTERN RECOGNITION LETTERS, 2020, 129: 108-114
[10] Ghosh, Suman K.; Valveny, Ernest; Bagdanov, Andrew D. Visual attention models for scene text recognition [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017: 943-948
