Unconstrained end-to-end text reading with feature rectification

Cited by: 2
Authors
Du, Chen [1, 2]
Wang, Yanna [1]
Wang, Chunheng [1]
Xiao, Baihua [1]
Shi, Cunzhao [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text recognition; Text detection; Position-sensitive network; Features incompatibility; End-to-end; NETWORK;
DOI
10.1016/j.patrec.2021.05.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose an end-to-end trainable network that simultaneously localizes and recognizes irregular text in images. Specifically, we identify a feature incompatibility problem, which arises because detection and recognition place conflicting demands on the features extracted by the convolutional neural network. To improve recognition accuracy, we feed larger-scale features to the recognition branch instead of sharing the same features with the detection branch. To extract effective features for recognizing perspective and curved text, we propose a position-sensitive network that rectifies the text proposal features in the recognition branch. The position-sensitive network, which is trained with weak supervision, takes the proposal detection feature as input and outputs the feature rectification information. Experiments demonstrate that the proposed method achieves state-of-the-art or highly competitive performance compared with baselines on a number of benchmarks. © 2021 Elsevier B.V. All rights reserved.
Pages: 1-8
Number of pages: 8