A robust approach for text detection from natural scene images

Cited by: 73

Authors:

Sun, Lei [1]

Huo, Qiang [2]

Jia, Wei [3]

Chen, Kai [1]

Affiliations:

[1] Univ Sci & Technol China, Dept Elect Sci & Technol, Hefei 230026, Peoples R China

[2] Microsoft Res Asia, Beijing, Peoples R China

[3] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230026, Peoples R China

Source:

PATTERN RECOGNITION | 2015, Vol. 48, No. 09

Keywords:

Text detection; Natural scene images; Color-enhanced contrasting extremal region; Neural networks; COMPONENT-TREE; EXTRACTION; VIDEO;

DOI:

10.1016/j.patcog.2015.04.002

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a robust text detection approach based on color-enhanced contrasting extremal regions (CERs) and neural networks. Given a color natural scene image, six component-trees are built: one each from its grayscale image and from its hue and saturation channel images in a perception-based illumination-invariant color space, plus one from the inverted version of each of these three images. From each component-tree, color-enhanced CERs are extracted as character candidates. Using a "divide-and-conquer" strategy, each candidate image patch is reliably labeled by rules as one of five types, namely Long, Thin, Fill, Square-large and Square-small, and is then classified as text or non-text by a corresponding neural network trained with an ambiguity-free learning strategy. After unambiguous non-text components are pruned, repeating components in each component-tree are pruned further. The remaining components are then grouped into candidate text-lines and verified by another set of neural networks. Finally, the results from the six component-trees are combined, and a post-processing step is used to recover lost characters. The proposed method achieves superior performance on both the ICDAR-2011 and ICDAR-2013 "Reading Text in Scene Images" test sets. (C) 2015 Elsevier Ltd. All rights reserved.
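The first step of the abstract (deriving six single-channel images, one per component-tree) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a perception-based illumination-invariant color space, whereas this sketch substitutes ordinary HSV from Python's standard `colorsys` module as a stand-in, and assumes BT.601 luma weights for the grayscale image. The function name `six_channel_images` and the nested-list image layout are hypothetical choices made for illustration.

```python
import colorsys


def six_channel_images(rgb_image):
    """Return six 8-bit single-channel images from an RGB image.

    rgb_image: list of rows, each row a list of (r, g, b) ints in 0..255.
    NOTE: plain HSV is used here as a stand-in; the paper's hue and
    saturation come from a different, illumination-invariant color space.
    """
    gray, hue, sat = [], [], []
    for row in rgb_image:
        g_row, h_row, s_row = [], [], []
        for r, g, b in row:
            # Grayscale via BT.601 luma weights (an assumed choice).
            g_row.append(round(0.299 * r + 0.587 * g + 0.114 * b))
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # Scale hue and saturation from [0, 1] to 0..255.
            h_row.append(round(h * 255))
            s_row.append(round(s * 255))
        gray.append(g_row)
        hue.append(h_row)
        sat.append(s_row)

    channels = {"gray": gray, "hue": hue, "saturation": sat}
    # Each channel also contributes its inverted image, giving six in total.
    for name in ("gray", "hue", "saturation"):
        channels[name + "_inverted"] = [
            [255 - v for v in row] for row in channels[name]
        ]
    return channels


if __name__ == "__main__":
    # A 1x2 image: one pure red pixel and one white pixel.
    for name, img in six_channel_images([[(255, 0, 0), (255, 255, 255)]]).items():
        print(name, img)
```

In a full pipeline, each of the six images would feed its own component-tree, so that characters brighter or darker than their background, or distinguishable only by color, can each appear as extremal regions in at least one tree.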


Pages: 2906-2920

Page count: 15

