MORAN: A Multi-Object Rectified Attention Network for scene text recognition

Cited by: 318
Authors
Luo, Canjie [1]
Jin, Lianwen [1, 2]
Sun, Zenghui [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
[2] SCUT Zhuhai Inst Modern Ind Innovat, Zhuhai, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Scene text recognition; Optical character recognition; Deep learning;
DOI
10.1016/j.patcog.2019.01.020
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Irregular text is widely used. However, it is considerably difficult to recognize because of its various shapes and distorted patterns. In this paper, we thus propose a multi-object rectified attention network (MORAN) for general scene text recognition. The MORAN consists of a multi-object rectification network and an attention-based sequence recognition network. The multi-object rectification network is designed for rectifying images that contain irregular text. It decreases the difficulty of recognition and enables the attention-based sequence recognition network to more easily read irregular text. It is trained in a weak supervision way, thus requiring only images and corresponding text labels. The attention-based sequence recognition network focuses on target characters and sequentially outputs the predictions. Moreover, to improve sensitivity of the attention-based sequence recognition network, a fractional pickup method is proposed for an attention-based decoder in the training phase. With the rectification mechanism, the MORAN can read both regular and irregular scene text. Extensive experiments on various benchmarks are conducted, which show that the MORAN achieves state-of-the-art performance. The source code is available. © 2019 Elsevier Ltd. All rights reserved.
Pages: 109-118
Number of pages: 10