Structural feature-based evaluation method of binarization techniques for word retrieval in the degraded Arabic document images

Cited by: 0
Authors
Toufik Sari
Abderrahmane Kefali
Halima Bahi
Affiliation
[1] Badji Mokhtar University, LabGED Laboratory
Source
International Journal on Document Analysis and Recognition (IJDAR) | 2016 / Vol. 19
Keywords
Binarization; Word spotting; Document retrieval; Thresholding evaluation; Edit distance; Performance assessment
DOI
Not available
Abstract
Binarization, which separates the foreground from the background, is one of the most important steps in document analysis and recognition. Several binarization techniques have been proposed in the literature, but none is reliable for all image types, which makes selecting a method for a given application very difficult. Performance evaluation of binarization algorithms is therefore vital. In this paper, we evaluate binarization techniques for the purpose of retrieving words from images of degraded Arabic documents. We propose a new evaluation methodology based on comparing the visual features extracted from the binarized document images with ground-truth features, rather than comparing the images themselves. The most appropriate thresholding method for each image is the one for which the visual features of the words identified in the image are "closest" to the features of the reference words. The proposed technique is used here to assess the performance of eleven algorithms, based on different approaches, on a collection of real and synthetic images.
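The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each word's visual features are encoded as a string of symbols and that "closeness" is measured by edit distance, which the keywords suggest but the abstract does not specify. The names `edit_distance` and `best_method`, the method labels, and the feature strings are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_method(features_by_method, reference_features):
    """Pick the method whose word features are closest to the reference.

    Assumption: every method yields one feature string per reference word,
    in the same order as the reference list.
    """
    scores = {
        method: sum(edit_distance(w, r) for w, r in zip(words, reference_features))
        for method, words in features_by_method.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical feature strings for two words of one document image.
reference = ["ADLD", "LDA"]
candidates = {
    "method_a": ["ADLD", "LA"],  # features extracted after binarization A
    "method_b": ["ADD", "DA"],   # features extracted after binarization B
}
winner, scores = best_method(candidates, reference)
print(winner, scores)
```

Summing the per-word distances is one possible aggregation. A normalized or per-word ranking scheme would fit the same description equally well.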
Pages: 31-47
Page count: 16