Multi-Scale Scene Text Detection Based on Convolutional Neural Network

Cited by: 0
Authors
Lu, Yan-Feng [1]
Zhang, Ai-Xuan [2]
Li, Yi [3]
Yu, Qian-Hui [4]
Qiao, Hong [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[2] China Acad Aerosp, Standardizat & Prod Assurance, Beijing, Peoples R China
[3] Nanchang Univ, Sch Informat Engn, Nanchang, Jiangxi, Peoples R China
[4] Harbin Engn Univ, Coll Informat & Commun Engn, Harbin, Peoples R China
Source
2019 CHINESE AUTOMATION CONGRESS (CAC2019) | 2019
Funding
National Natural Science Foundation of China
Keywords
deep learning; natural scene; text detection; convolutional neural network; pyramid network
DOI
10.1109/cac48633.2019.8996635
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Faster R-CNN performs well on general object detection. However, because scene text is highly variable and subject to external interference, it does not achieve satisfactory results in natural scene text detection. Moreover, deep-learning-based text detection algorithms need large data sets to train the network, so their performance is likely to be limited in special scenarios where large numbers of samples cannot be obtained. Accurately detecting text in natural scenes from small data sets is therefore a challenging problem. To address it, a multi-scale text feature extraction network with a feature pyramid, built on Faster R-CNN, is proposed; it can accurately and comprehensively express the complex and changeable text features of natural scenes even when data are scarce. Experimental results show that the proposed MSTD method is very competitive with existing related architectures.
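The abstract does not give the network's details, so the following is only a minimal sketch of the general feature-pyramid idea it refers to: coarse feature maps are upsampled and added to finer ones so that every scale carries context from the deeper levels. It is not the authors' implementation. The function names (`upsample2x`, `add_maps`, `top_down_merge`) are hypothetical, feature maps are simplified to single-channel 2D grids, and the lateral 1x1 convolutions and 3x3 smoothing convolutions of a real feature pyramid network are omitted.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (simplifying assumption)


def upsample2x(fm: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out: Grid = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def add_maps(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two feature maps of identical size."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_merge(levels: List[Grid]) -> List[Grid]:
    """Merge a pyramid ordered fine -> coarse, each level half the previous size.

    The coarsest map is kept as is; every finer map receives the upsampled
    merged map from the level above it. Output order matches input order.
    """
    merged = [levels[-1]]
    for fm in reversed(levels[:-1]):
        merged.append(add_maps(fm, upsample2x(merged[-1])))
    merged.reverse()
    return merged


if __name__ == "__main__":
    # Toy pyramid: 4x4, 2x2 and 1x1 maps filled with constant values.
    c2 = [[1.0] * 4 for _ in range(4)]
    c3 = [[2.0] * 2 for _ in range(2)]
    c4 = [[3.0]]
    p2, p3, p4 = top_down_merge([c2, c3, c4])
    print(len(p2), len(p3), len(p4))
```

In a detector built on Faster R-CNN, region proposals would then be generated on each merged level rather than on a single feature map, which is how small and large text instances can both be covered.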
Pages: 583-587
Page count: 5