Multi-Scale Scene Text Detection Based on Convolutional Neural Network

Cited by: 0
Authors
Lu, Yan-Feng [1]
Zhang, Ai-Xuan [2]
Li, Yi [3]
Yu, Qian-Hui [4]
Qiao, Hong [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[2] China Acad Aerosp, Standardizat & Prod Assurance, Beijing, Peoples R China
[3] Nanchang Univ, Sch Informat Engn, Nanchang, Jiangxi, Peoples R China
[4] Harbin Engn Univ, Coll Informat & Commun Engn, Harbin, Peoples R China
Source
2019 Chinese Automation Congress (CAC2019) | 2019
Funding
National Natural Science Foundation of China
Keywords
deep learning; natural scene; text detection; convolutional neural network; pyramid network
DOI
10.1109/cac48633.2019.8996635
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Faster R-CNN performs well on object detection tasks. However, because text varies widely in appearance and is subject to interference from external factors, Faster R-CNN falls short of ideal results in natural scene text detection. Moreover, text detection algorithms based on deep learning need large data sets to train the network; in special scenarios where large numbers of samples cannot be obtained, their performance is likely to be limited. Accurately detecting text in natural scenes from small data sets is therefore a challenging issue. To address it, a multi-scale text feature extraction network with a feature pyramid, based on Faster R-CNN, is proposed. It can accurately and comprehensively express complex and changeable text features in natural scenes, even when only small data sets are available. Experimental results show that the proposed MSTD method is very competitive with existing related architectures.
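The abstract does not describe the network's layers, so the following is only a minimal sketch of the general feature-pyramid idea it refers to: coarse feature maps are upsampled and added to finer ones, so that every scale carries context from the coarser scales. It assumes single-channel maps stored as nested lists, each level exactly half the resolution of the previous one. The names `upsample2x`, `add_maps` and `top_down_pyramid` are hypothetical, and the convolutions a real pyramid network applies before and after merging are omitted. It is not the MSTD architecture itself.

```python
# Illustrative sketch only: a generic feature-pyramid top-down merge.
# This is NOT the MSTD network; all names here are hypothetical.


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_pyramid(lateral_maps):
    """Merge maps ordered fine-to-coarse; returns merged maps in the same order.

    Assumes each map has half the height and width of the one before it.
    """
    merged = [lateral_maps[-1]]  # the coarsest level is passed through as-is
    for lateral in reversed(lateral_maps[:-1]):
        merged.append(add_maps(lateral, upsample2x(merged[-1])))
    merged.reverse()
    return merged


# Toy three-level input: 4x4, 2x2 and 1x1 maps filled with constants.
c3 = [[1.0] * 4 for _ in range(4)]
c4 = [[2.0] * 2 for _ in range(2)]
c5 = [[3.0]]
p3, p4, p5 = top_down_pyramid([c3, c4, c5])
```

In this toy input the finest output `p3` accumulates contributions from all three levels, which is the property that lets a detector handle both small and large text instances from one set of feature maps.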
Pages: 583-587
Page count: 5