Binarization of Degraded Document Images with Generalized Gaussian Distribution

Cited by: 5
Authors
Krupinski, Robert [1]
Lech, Piotr [1]
Teclaw, Mateusz [1]
Okarma, Krzysztof [1]
Affiliations
[1] West Pomeranian Univ Technol, Fac Elect Engn, Dept Signal Proc & Multimedia Engn, Sikorskiego 37, PL-70313 Szczecin, Poland
Source
COMPUTATIONAL SCIENCE - ICCS 2019, PT V | 2019, Vol. 11540
Keywords
Document images; Image binarization; Generalized Gaussian Distribution; Monte Carlo method; Thresholding; RECOGNITION; PARAMETER;
DOI
10.1007/978-3-030-22750-0_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Binarization is one of the most crucial preprocessing steps for document images intended for text recognition, since it significantly influences the OCR results. Because classical global and local thresholding methods may be inappropriate for degraded images, particularly historical documents, their binarization remains a challenging and topical task. This paper presents a novel approach that uses the Generalized Gaussian Distribution for this purpose. Assuming that the distortions in historical document images can be modelled with a Gaussian noise distribution, their histograms can be observed to resemble closely those of binary images corrupted by Gaussian noise. Therefore, by extracting the parameters of the Generalized Gaussian Distribution, the distortions may be modelled and removed, improving the quality of the input data for subsequent thresholding and text recognition. Because the processing time is relatively long, the Monte Carlo method is also proposed to shorten it. The presented algorithm has been verified on the well-known DIBCO datasets, leading to very promising binarization results.
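The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the density parameterization (mean `mu`, standard deviation `sigma`, shape `p`), the function names `ggd_pdf` and `monte_carlo_histogram`, the synthetic test image, and the sample size are all assumptions chosen for illustration. The sketch only shows the standard Generalized Gaussian density, which reduces to the normal density for `p = 2`, and how a histogram might be estimated from randomly drawn pixels instead of the full image.

```python
import math
import numpy as np

def ggd_pdf(x, mu=0.0, sigma=1.0, p=2.0):
    """Generalized Gaussian density with mean mu, std sigma, shape p.

    p = 2 gives the normal density, p = 1 the Laplacian density.
    """
    lam = math.sqrt(math.gamma(3.0 / p) / math.gamma(1.0 / p)) / sigma
    coeff = p * lam / (2.0 * math.gamma(1.0 / p))
    x = np.asarray(x, dtype=float)
    return coeff * np.exp(-(lam * np.abs(x - mu)) ** p)

def monte_carlo_histogram(image, n_samples, rng):
    """Estimate the normalized 256-bin histogram from randomly drawn pixels."""
    flat = image.ravel()
    idx = rng.integers(0, flat.size, size=n_samples)  # sampling with replacement
    hist = np.bincount(flat[idx], minlength=256).astype(float)
    return hist / hist.sum()

# Synthetic stand-in for a degraded document: a binary image (dark "ink" at 60,
# light "paper" at 200) corrupted by additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.where(rng.random((256, 256)) < 0.2, 60, 200)
noisy = np.clip(clean + rng.normal(0, 15, clean.shape), 0, 255).astype(np.uint8)

full_hist = np.bincount(noisy.ravel(), minlength=256) / noisy.size
mc_hist = monte_carlo_histogram(noisy, 5000, rng)  # uses about 8% of the pixels
```

Here `mc_hist` approximates `full_hist` while reading only a small fraction of the pixels, which is the general idea behind using random sampling to cut processing time. How the paper fits the distribution parameters and removes the modelled distortions is not described in the abstract and is not reproduced here.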
Pages: 177-190
Page count: 14