Degraded document image binarization using structural symmetry of strokes

Cited by: 64
Authors
Jia, Fuxi [1]
Shi, Cunzhao [1]
He, Kun [1]
Wang, Chunheng [1]
Xiao, Baihua [1]
Affiliations
[1] Univ Chinese Acad Sci, Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Document image binarization; Structural symmetry of strokes; Local threshold; Stroke width estimation;
DOI
10.1016/j.patcog.2017.09.032
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an effective approach to local threshold binarization of degraded document images. We use structural symmetric pixels (SSPs) to calculate the local threshold in each neighborhood, and a vote among multiple thresholds determines whether a pixel belongs to the foreground. SSPs are defined as pixels around strokes whose gradient magnitudes are sufficiently large and whose gradient orientations are symmetrically opposite. SSPs are extracted from a compensated gradient map to weaken the influence of document degradations. To extract SSP candidates with large magnitudes and to distinguish faint characters from bleed-through background, we propose an adaptive global threshold selection algorithm. To further extract pixels with opposite orientations, an iterative stroke width estimation algorithm ensures that the neighborhood used in the orientation judgement has the proper size. Finally, we present a multiple-threshold voting framework to handle inaccurate SSP detections. Experimental results on seven public document image binarization datasets show that our method is accurate and robust compared with many traditional and state-of-the-art document binarization approaches, according to multiple evaluation measures. (C) 2017 Elsevier Ltd. All rights reserved.
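The multiple-threshold voting idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: high-gradient pixels stand in for SSPs (no compensated gradient map, orientation-symmetry check, or stroke width estimation is performed), a fixed fraction of the maximum gradient stands in for the adaptive global threshold, and the names `vote_binarize`, `window_sizes`, and `grad_frac` are hypothetical. Dark text on a light background is assumed.

```python
import numpy as np

def vote_binarize(img, window_sizes=(3, 5, 7), grad_frac=0.5):
    """Toy local-threshold voting; returns True where a pixel is foreground.

    Assumption: pixels with large gradient magnitude act as a crude
    stand-in for the structural symmetric pixels (SSPs) of the paper.
    """
    img = img.astype(float)
    gy, gx = np.gradient(img)            # central-difference gradients
    mag = np.hypot(gx, gy)
    # Fixed-fraction threshold; the paper uses an adaptive selection instead.
    edge = mag >= grad_frac * mag.max()

    h, w = img.shape
    votes = np.zeros((h, w), dtype=int)
    for k in window_sizes:               # one local threshold per window size
        r = k // 2
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - r), min(h, y + r + 1)
                x0, x1 = max(0, x - r), min(w, x + r + 1)
                mask = edge[y0:y1, x0:x1]
                if not mask.any():
                    continue             # no edge pixels nearby: no vote
                # Local threshold: mean intensity of nearby edge pixels.
                t = img[y0:y1, x0:x1][mask].mean()
                if img[y, x] < t:        # darker than threshold: vote foreground
                    votes[y, x] += 1
    # A pixel is foreground if a strict majority of thresholds say so.
    return votes * 2 > len(window_sizes)

# Usage: a 3-pixel-wide dark vertical stroke on a light background.
page = np.full((20, 20), 200.0)
page[:, 8:11] = 50.0
binary = vote_binarize(page)
```

The nested loops keep the sketch readable rather than fast. Using several window sizes means a single threshold computed from a poorly chosen neighborhood can be outvoted, which is the role the abstract assigns to the voting framework.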
Pages: 225-240
Page count: 16