Integrated Text Detection and Recognition in Natural Images

Cited by: 2
Authors
Roubtsova, Nadejda S. [1]
Wijnhoven, Rob G. J.
de With, Peter H. N. [1, 2]
Affiliations
[1] Eindhoven Univ Technol, Dep Elect Engn, POB 513, NL-5600 MB Eindhoven, Netherlands
[2] ViNotion Ltd, Amersfoort, Netherlands
Source
IMAGE PROCESSING: ALGORITHMS AND SYSTEMS X AND PARALLEL PROCESSING FOR IMAGING APPLICATIONS II | 2012 / Vol. 8295
Keywords
text detection; OCR; process integration; text characterization;
DOI
10.1117/12.906761
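Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject classification code
0808 ; 0809 ;
Abstract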
Text detection and recognition in natural images have conventionally been treated in the prior art as autonomous tasks executed in a strictly sequential processing chain, with limited information sharing between sub-systems. This approach is flawed because it introduces (1) redundancy, by extracting the same text properties multiple times, and (2) error, by preventing verification of hard (often binarized) detection results at later stages. We explore the possibilities for integrating the detection and recognition modules through a feedforward multidimensional information stream. Integration involves characterizing the text string suitably at detection and applying that knowledge to ease recognition by a given OCR system. The choice of characterization properties generally depends on the OCR system, although some of them have proven universally applicable. We show that the proposed integration measures enable more robust recognition of text in complex, unconstrained natural environments. Specifically, integration by the proposed measures (1) eliminates textual input irregularities that recognition engines cannot handle and (2) adaptively tunes the recognition stage for each input image. The former function boosts correct detections, while the latter mainly reduces the number of false positives. Our validation experiments on a set of low-quality natural images show that adaptively tuning the OCR stage to the typical text-to-background transitions in the input image (gradient significance profiling) yields an improvement of 29% in precision-recall performance, mostly by boosting precision.
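
The abstract does not say how gradient significance profiling is computed. The sketch below is only one possible reading of "tuning the OCR stage to the typical text-to-background transitions": it measures intensity differences inside a detected text box and derives a per-image edge threshold that a recognition stage could use. The function names, the `noise_floor` parameter, and the "half the median strong gradient" rule are all assumptions made for illustration, not the authors' method.

```python
from statistics import median


def gradient_magnitudes(image, box):
    """Absolute horizontal and vertical intensity differences inside a box.

    image: list of rows of grayscale values (0-255).
    box:   (top, left, bottom, right), with bottom/right exclusive.
    """
    top, left, bottom, right = box
    mags = []
    for y in range(top, bottom):
        for x in range(left, right):
            if x + 1 < right:
                mags.append(abs(image[y][x + 1] - image[y][x]))
            if y + 1 < bottom:
                mags.append(abs(image[y + 1][x] - image[y][x]))
    return mags


def significance_threshold(image, box, noise_floor=10):
    """Hypothetical per-image edge threshold for a detected text region.

    Gradients at or below noise_floor are ignored (assumed to be noise);
    the threshold is half the median of the remaining gradients.
    Returns None when the region contains no significant transition.
    """
    strong = [m for m in gradient_magnitudes(image, box) if m > noise_floor]
    if not strong:
        return None
    return median(strong) / 2


# Two toy "images": the same dark stroke on a bright and on a dim background.
high_contrast = [[200, 200, 40, 40, 200, 200] for _ in range(4)]
low_contrast = [[120, 120, 100, 100, 120, 120] for _ in range(4)]
box = (0, 0, 4, 6)

print(significance_threshold(high_contrast, box))
print(significance_threshold(low_contrast, box))
```

Under these assumptions the threshold follows the contrast of each input, so a low-contrast sign is not held to the edge strength expected from a high-contrast one. A region with no gradient above the noise floor yields `None`, which a caller could treat as a likely false detection.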
Pages: 21