Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes

Cited by: 170
Authors
Zhang, Chengquan [1]
Liang, Borong [2]
Huang, Zuming [1]
En, Mengyi [1]
Han, Junyu [1]
Ding, Errui [1]
Ding, Xinghao [2]
Affiliations
[1] Baidu Inc, Dept Comp Vis Technol Vis, Beijing, Peoples R China
[2] Xiamen Univ, Fujian Key Lab Sensing & Comp Smart City, Xiamen, Peoples R China
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR.2019.01080
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene text detection methods have progressed substantially in recent years. However, limited by the receptive field of CNNs and by simple representations such as rectangular bounding boxes or quadrangles, previous methods may fall short on more challenging text instances, such as extremely long text and arbitrarily shaped text. To address these two problems, we present a novel text detector, LOMO, which localizes text progressively over multiple passes (in other words, LOok More than Once). LOMO consists of a direct regressor (DR), an iterative refinement module (IRM), and a shape expression module (SEM). First, the DR branch generates text proposals in the form of quadrangles. Next, the IRM progressively perceives the entire long text through iterative refinement based on feature blocks extracted from the preliminary proposals. Finally, the SEM reconstructs a more precise representation of irregular text by considering the geometric properties of the text instance, including the text region, text center line, and border offsets. State-of-the-art results on several public benchmarks, including ICDAR2017-RCTW, SCUT-CTW1500, Total-Text, ICDAR2015, and ICDAR17-MLT, confirm the striking robustness and effectiveness of LOMO.
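The three-stage flow described in the abstract (DR proposals, then repeated IRM refinement, then SEM shape reconstruction) can be pictured with a toy sketch. This is not the authors' implementation: `detect`, `toy_dr`, `toy_irm`, `toy_sem`, the `text_extent` field, and the halving rule used for refinement are all hypothetical stand-ins, chosen only to show how a coarse quadrangle might be widened over several passes before a final polygon is emitted.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def detect(image: Dict, dr: Callable, irm: Callable, sem: Callable,
           num_refinements: int = 2) -> List[Polygon]:
    """Coarse-to-fine control flow: propose, refine repeatedly, then shape."""
    proposals = dr(image)                      # stage 1: coarse quadrangles
    for _ in range(num_refinements):           # stage 2: "look more than once"
        proposals = [irm(image, quad) for quad in proposals]
    return [sem(image, quad) for quad in proposals]  # stage 3: final polygons


# --- Hypothetical stand-ins for the learned modules (assumptions only) ---

def toy_dr(image: Dict) -> List[Polygon]:
    # Pretend the regressor only sees the middle of a long text line.
    return [[(30.0, 0.0), (70.0, 0.0), (70.0, 10.0), (30.0, 10.0)]]


def toy_irm(image: Dict, quad: Polygon) -> Polygon:
    # Move each corner halfway toward the nearer end of the true text extent.
    x0, x1 = image["text_extent"]
    mid = (x0 + x1) / 2
    refined = []
    for x, y in quad:
        target = x0 if x < mid else x1
        refined.append((x + 0.5 * (target - x), y))
    return refined


def toy_sem(image: Dict, quad: Polygon) -> Polygon:
    # A real shape module would emit a polygon that follows curved text;
    # this stand-in simply returns the refined corners.
    return list(quad)


toy_image = {"text_extent": (0.0, 100.0)}
polygons = detect(toy_image, toy_dr, toy_irm, toy_sem)
unrefined = detect(toy_image, toy_dr, toy_irm, toy_sem, num_refinements=0)
```

In this toy setting each extra pass halves the gap between the proposal and the full text extent, which mirrors the intuition that a single look with a limited receptive field may not cover an extremely long text line.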
Pages: 10544-10553
Page count: 10