Detection and rectification of arbitrary shaped scene texts by using text keypoints and links

Cited by: 9
Authors
Xue, Chuhui [1]
Lu, Shijian [1]
Hoi, Steven [2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Salesforce, Singapore, Singapore
Keywords
Scene text detection; Scene text recognition; Deep learning; Neural network;
DOI
10.1016/j.patcog.2021.108494
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the rich variation of text shapes in line orientation, length, curvature, etc. This paper presents a mask-guided multi-task network that detects and rectifies scene texts of arbitrary shapes reliably. Three types of keypoints are detected, which specify the centre line, and hence the shape, of each text instance accurately. In addition, four types of keypoint links are detected: the horizontal links associate the detected keypoints of each text instance, and the vertical links predict a pair of landmark points (for each keypoint) along the upper and lower text boundaries, respectively. Scene texts can be located by linking up the associated landmark points (giving localization polygon boxes) and rectified by transforming the polygon boxes via thin plate spline. Extensive experiments over several public datasets show that the use of text keypoints is tolerant to variation in text orientation, length, and curvature, and that it achieves competitive scene text detection and rectification performance compared with state-of-the-art methods. (c) 2021 Elsevier Ltd. All rights reserved.
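The localization step in the abstract (linking landmark points into a polygon box) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `polygon_from_landmarks` and `rectified_control_points` are hypothetical, the landmark pairs are assumed to be already ordered along the text line, and the evenly spaced rectangle targets are an assumption made here for illustration. The thin plate spline fit itself is omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_from_landmarks(upper: List[Point], lower: List[Point]) -> List[Point]:
    """Link landmark points into a closed polygon outline.

    Assumes upper[i] and lower[i] are the landmark pair of the i-th keypoint,
    ordered along the text line. The outline walks the upper boundary forwards
    and the lower boundary backwards.
    """
    if len(upper) != len(lower) or len(upper) < 2:
        raise ValueError("need at least two matched upper/lower landmark pairs")
    return list(upper) + list(reversed(lower))


def rectified_control_points(
    n_pairs: int, width: float, height: float
) -> Tuple[List[Point], List[Point]]:
    """Target control points for rectification (hypothetical choice).

    Places n_pairs evenly spaced points on the top and bottom edges of a
    width x height rectangle. A thin plate spline fitted from the detected
    landmarks to these targets would straighten the text region.
    """
    if n_pairs < 2:
        raise ValueError("need at least two landmark pairs")
    xs = [width * i / (n_pairs - 1) for i in range(n_pairs)]
    top = [(x, 0.0) for x in xs]
    bottom = [(x, float(height)) for x in xs]
    return top, bottom


# Toy curved text line with three keypoints (made-up coordinates).
upper = [(0.0, 10.0), (10.0, 5.0), (20.0, 10.0)]
lower = [(0.0, 20.0), (10.0, 15.0), (20.0, 20.0)]

polygon = polygon_from_landmarks(upper, lower)
top, bottom = rectified_control_points(len(upper), width=100.0, height=32.0)
```

Here `upper + lower` would serve as the source control points and `top + bottom` as the target control points of a thin plate spline warp.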
Pages: 10
Related papers
45 in total
  • [1] [Anonymous], 2017, ARXIV171202170
  • [2] Character Region Awareness for Text Detection
    Baek, Youngmin
    Lee, Bado
    Han, Dongyoon
    Yun, Sangdoo
    Lee, Hwalsuk
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9357 - 9366
  • [4] Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition
    Ch'ng, Chee Kheng
    Chan, Chee Seng
    [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 935 - 942
  • [5] Chee Kheng Chng, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR). Proceedings, P1571, DOI 10.1109/ICDAR.2019.00252
  • [6] Deng D, 2018, AAAI CONF ARTIF INTE, P6773
  • [7] Synthetic Data for Text Localisation in Natural Images
    Gupta, Ankush
    Vedaldi, Andrea
    Zisserman, Andrew
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2315 - 2324
  • [8] Realtime multi-scale scene text detection with scale-based region proposal network
    He, Wenhao
    Zhang, Xu-Yao
    Yin, Fei
    Luo, Zhenbo
    Ogier, Jean-Marc
    Liu, Cheng-Lin
    [J]. PATTERN RECOGNITION, 2020, 98
  • [9] CornerNet: Detecting Objects as Paired Keypoints
    Law, Hei
    Deng, Jia
    [J]. COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 765 - 781
  • [10] Liao MH, 2020, AAAI CONF ARTIF INTE, V34, P11474