Text detection in street level images

Cited by: 33
Authors
Fabrizio, Jonathan [1]
Marcotegui, Beatriz [2]
Cord, Matthieu [3]
Affiliations
[1] LRDE EPITA Lab, F-94276 Le Kremlin Bicetre, France
[2] Mines ParisTech, CMM Ctr Morphol Math Math & Syst, F-77305 Fontainebleau, France
[3] UPMC Sorbonne Univ, Lab LIP6, F-75005 Paris, France
Keywords
Text detection; Text segmentation; TMMS; Toggle mapping; Image classification; EXTRACTION; SEGMENTATION; VIDEO; LOCALIZATION; RECOGNITION
DOI
10.1007/s10044-013-0329-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text detection in natural images is a very challenging task in computer vision. Image acquisition introduces distortions in perspective, blur, and illumination, and characters may vary widely in shape, size, and color. In this article, we introduce a complete text detection scheme. Our architecture combines a hypothesis generation step, which produces candidate text boxes, with a hypothesis validation step, which filters out false detections. Hypothesis generation relies on a new, efficient segmentation method based on a morphological operator. Regions are then filtered and classified using shape descriptors based on Fourier, pseudo-Zernike moments, and an original polar descriptor that is invariant to rotation. The classification process relies on three SVM classifiers combined in a late fusion scheme. Detected characters are finally grouped to generate the text box hypotheses. The validation step is based on a global SVM classification of the box content, using dedicated descriptors adapted from the HOG approach. Results on the well-known ICDAR database show that the method is competitive. The evaluation protocol and metrics are discussed in depth, and results on a very challenging street-level database are also reported.
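The "Toggle mapping" keyword refers to a morphological operator that compares each pixel with the local erosion (window minimum) and dilation (window maximum) of the image. The sketch below is only an illustration of that general idea and not the authors' TMMS implementation: the function names, the square window, the `radius` and `min_contrast` parameters, and the three-label output are all assumptions made for this example.

```python
def _window(img, r, c, radius):
    """Collect the pixel values of a square window clipped to the image borders."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def toggle_segment(img, radius=1, min_contrast=16):
    """Label each pixel of a grayscale image (list of lists of ints).

    0: local contrast (dilation - erosion) is below min_contrast
    1: pixel value is closer to the local erosion (dark side)
    2: pixel value is closer to the local dilation (bright side)

    radius and min_contrast are hypothetical parameters chosen for this sketch.
    """
    labels = []
    for r, row in enumerate(img):
        out = []
        for c, value in enumerate(row):
            win = _window(img, r, c, radius)
            ero, dil = min(win), max(win)  # flat erosion and dilation
            if dil - ero < min_contrast:
                out.append(0)  # homogeneous area, left undecided
            elif value - ero <= dil - value:
                out.append(1)
            else:
                out.append(2)
        labels.append(out)
    return labels


# A dark two-pixel stroke on a bright background.
image = [
    [200, 200, 200, 200, 200],
    [200, 200, 20, 200, 200],
    [200, 200, 20, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]
labels = toggle_segment(image)
```

In this toy image the stroke pixels receive label 1, the bright pixels bordering the stroke receive label 2, and the flat corners stay at 0. Connected regions of equal label could then serve as character candidates for a later classification stage.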
Pages: 519-533
Number of pages: 15