Text String Detection From Natural Scenes by Structure-Based Partition and Grouping

Cited by: 201
Authors
Yi, Chucai [1]
Tian, YingLi [2]
Affiliations
[1] CUNY, Grad Ctr, New York, NY 10016 USA
[2] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Adjacent character grouping; character property; image partition; text line grouping; text string detection; text string structure; EXTRACTION; SEGMENTATION;
DOI
10.1109/TIP.2011.2126586
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text information in natural scene images provides important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text in a complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) image partition to find text character candidates based on local gradient features and color uniformity of character components and 2) character candidate grouping to detect text strings based on joint structural features of text characters in each text string, such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms for text string detection: 1) the adjacent character grouping method and 2) the text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into a text string. The text line grouping method performs a Hough transform to fit text lines among the centroids of text candidates. Each fitted text line describes the orientation of a potential text string. The detected text string is represented by a rectangular region covering all characters whose centroids are cascaded along its text line. To improve efficiency and accuracy, our algorithms are carried out at multiple scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of our methods in detecting text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset, which we collected and which contains text strings in nonhorizontal orientations.
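The adjacent character grouping idea in the abstract (compute a sibling group for each character candidate, then merge intersecting groups and keep strings of at least three characters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Candidate class, the are_siblings test, and all three thresholds are hypothetical, and the pairwise test assumes roughly horizontal text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Bounding box of one character candidate (hypothetical representation)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def cx(self) -> float:
        return self.x + self.w / 2

    @property
    def cy(self) -> float:
        return self.y + self.h / 2


def are_siblings(a, b, max_size_ratio=2.0, max_gap_factor=2.0,
                 max_offset_factor=0.5):
    """Illustrative pairwise test; the thresholds are assumptions,
    not values taken from the paper."""
    big_h, small_h = max(a.h, b.h), min(a.h, b.h)
    if small_h <= 0 or big_h / small_h > max_size_ratio:
        return False  # character sizes differ too much
    if abs(a.cx - b.cx) > max_gap_factor * max(a.w, b.w):
        return False  # centroids too far apart horizontally
    if abs(a.cy - b.cy) > max_offset_factor * big_h:
        return False  # poorly aligned vertically
    return True


def group_adjacent_characters(cands, min_chars=3):
    """Return lists of candidate indices, one list per detected string."""
    n = len(cands)
    # Sibling group of i: i itself plus every candidate passing the test.
    groups = [
        {i} | {j for j in range(n)
               if j != i and are_siblings(cands[i], cands[j])}
        for i in range(n)
    ]
    # Merge sibling groups that share at least one candidate.
    merged = []
    for group in groups:
        group = set(group)
        disjoint = []
        for existing in merged:
            if existing & group:
                group |= existing  # absorb the intersecting group
            else:
                disjoint.append(existing)
        disjoint.append(group)
        merged = disjoint
    # Keep only strings with at least min_chars characters.
    return [sorted(g) for g in merged if len(g) >= min_chars]
```

For instance, three equally sized boxes placed side by side plus one distant box would yield a single group holding the indices of the three neighbors, while the isolated box is discarded by the three-character minimum.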
Pages: 2594-2605
Number of pages: 12