Towards Visual Words to Words Text Detection with a General Bag of Words Representation

Cited by: 0
Authors
Mehta, Rakesh [1]
Chum, Ondrej [2]
Matas, Jiri [2]
Affiliations
[1] Tampere Univ Technol, Dept Signal Proc, FIN-33101 Tampere, Finland
[2] Czech Tech Univ, Fac Elect Engn, Ctr Machine Percept, Dept Cybernet, Prague, Czech Republic
Source
2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR) | 2015
Keywords
IMAGES
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the problem of text localization and retrieval in real-world images. We are the first to study the retrieval of text images, i.e., the selection of images containing text from large collections at high speed. We propose a novel representation, textual visual words, which describes text by generic visual words that predict the bottom and top lines of text in a geometrically consistent manner. The visual words are discretized SIFT descriptors of Hessian features. The features may correspond to various structures present in the text: character fragments, individual characters, or their arrangements. The textual words representation is invariant to affine transformations of the image and to local linear changes of intensity. Experiments demonstrate that the proposed method outperforms the state of the art on the MS dataset. The proposed method detects blurry, small-font, low-contrast, and noisy text in real-world images.
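As an illustration of the "discretized descriptors" idea in the abstract, the following is a minimal sketch of a generic bag-of-visual-words step: each local descriptor is assigned to its nearest codebook centroid, and the image is summarized as a histogram of the resulting word ids. It is not the authors' implementation. The function names, the toy 2-D descriptors (standing in for 128-D SIFT vectors), and the hand-picked codebook are all assumptions made for the example, and the geometric prediction of top and bottom text lines is not covered.

```python
from collections import Counter
from math import dist


def quantize(descriptor, codebook):
    """Return the index of the nearest codebook centroid (the visual word id)."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_words(descriptors, codebook):
    """Count how often each visual word occurs among the given descriptors."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Hypothetical toy data: a 3-word codebook and 4 descriptors in 2-D.
# Real SIFT descriptors are 128-dimensional and codebooks are far larger.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.2), (1.1, -0.1), (0.2, 0.8)]

histogram = bag_of_words(descriptors, codebook)
print(histogram)
```

In a retrieval setting, histograms of this kind are typically stored in an inverted index so that only images sharing visual words with the query need to be scored.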
Pages: 641-645
Page count: 5