A robust approach for text detection from natural scene images

Cited by: 73
Authors
Sun, Lei [1]
Huo, Qiang [2]
Jia, Wei [3]
Chen, Kai [1]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Sci & Technol, Hefei 230026, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230026, Peoples R China
Keywords
Text detection; Natural scene images; Color-enhanced contrasting extremal region; Neural networks; COMPONENT-TREE; EXTRACTION; VIDEO
DOI
10.1016/j.patcog.2015.04.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a robust text detection approach based on color-enhanced contrasting extremal region (CER) and neural networks. Given a color natural scene image, six component-trees are built from its grayscale image, its hue and saturation channel images in a perception-based illumination invariant color space, and their inverted images, respectively. From each component-tree, color-enhanced CERs are extracted as character candidates. By using a "divide-and-conquer" strategy, each candidate image patch is labeled reliably by rules as one of five types, namely, Long, Thin, Fill, Square-large and Square-small, and classified as text or non-text by a corresponding neural network, which is trained by an ambiguity-free learning strategy. After pruning unambiguous non-text components, repeating components in each component-tree are pruned further. Remaining components are then grouped into candidate text-lines and verified by another set of neural networks. Finally, results from six component-trees are combined, and a post-processing step is used to recover lost characters. Our proposed method achieves superior performance on both ICDAR-2011 and ICDAR-2013 "Reading Text in Scene Images" test sets. (C) 2015 Elsevier Ltd. All rights reserved.
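The first step of the abstract (three channel images plus their inverted versions, giving six inputs for component-tree construction) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain HSV from Python's `colorsys` module as a stand-in for the perception-based illumination invariant color space, uses the common BT.601 luma weights for the grayscale image, and the function name `six_channel_images` is hypothetical.

```python
import colorsys


def six_channel_images(rgb_image):
    """Derive six single-channel images from an RGB image.

    rgb_image: list of rows, each row a list of (r, g, b) ints in 0..255.
    Returns a dict mapping channel name to a 2-D list of ints in 0..255.

    Assumption: plain HSV replaces the perception-based illumination
    invariant color space named in the abstract.
    """
    gray, hue, sat = [], [], []
    for row in rgb_image:
        g_row, h_row, s_row = [], [], []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # BT.601 luma weights (an assumption; the abstract does not
            # say how the grayscale image is computed).
            g_row.append(round(0.299 * r + 0.587 * g + 0.114 * b))
            h_row.append(round(h * 255))
            s_row.append(round(s * 255))
        gray.append(g_row)
        hue.append(h_row)
        sat.append(s_row)

    def invert(img):
        # Inverting lets dark-on-light and light-on-dark regions both
        # appear as extremal regions of the same polarity.
        return [[255 - p for p in row] for row in img]

    out = {}
    for name, img in (("gray", gray), ("hue", hue), ("saturation", sat)):
        out[name] = img
        out[name + "_inverted"] = invert(img)
    return out


if __name__ == "__main__":
    tiny = [[(255, 0, 0), (255, 255, 255)]]  # one red and one white pixel
    for name, img in six_channel_images(tiny).items():
        print(name, img)
```

Each of the six images would then feed its own component-tree; the extraction of color-enhanced CERs, the five-type labeling rules and the neural network classifiers are not reproduced here because the abstract does not specify them.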
Pages: 2906-2920
Number of pages: 15