Non-stationary modeling for the separation of overlapped texts in documents

Citations: 0
Authors
Tonazzini, Anna [1]
Savino, Pasquale [1]
Salerno, Emanuele [1]
Affiliation
[1] CNR, Ist Sci & Tecnol Informaz, I-56124 Pisa, Italy
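Source
2014 22ND SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2014
Keywords
Document restoration; non-stationary data model; back-to-front interferences
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract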
In this paper, we address the removal of severe back-to-front interferences in archival documents when both recto and verso images of the page are available. We approach the problem from a modeling point of view, treating the ideal images of the two separated texts as individual source patterns that overlap in the observed images through a parametric mixing operator. Earlier approaches were based on linear mixtures of the ideal reflectance maps, or of the ideal optical densities and absorptance maps, through unknown coefficients or blur kernels. Approximations and/or partial user supervision were then adopted to jointly estimate the sources and the model parameters. Nevertheless, a feasible and reliable data model for this problem should at least be non-linear and space-variant, to cope with occlusions, ink saturation, and large variability of the mixing level. This is especially true for ancient documents affected by ink seeping (bleed-through). The search for such a model is still far from concluded, and may even be impossible to pursue, because information about the chemical and physical processes underlying the phenomenon is unavailable. Hence, we propose the use of pixel-dependent parameters within a model that is additive in the optical densities, to compensate not only for non-stationarity but also for missing or imprecise knowledge of the non-linearity and, more generally, for modeling errors.
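To make the idea of a model that is additive in the optical densities with pixel-dependent parameters more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a specific form that the abstract does not spell out: at each pixel, the observed density of one side equals its own ideal density plus the other side's density scaled by a per-pixel gain. The names `optical_density`, `mix`, `unmix`, `separate`, `q_recto`, and `q_verso` are hypothetical, the gain maps are taken as given (estimating them is the hard part and is not shown), and the verso image is assumed to be already mirrored and registered to the recto.

```python
import math


def optical_density(reflectance, background=1.0, eps=1e-6):
    """Assumed definition: D = -log(s / R_b), with s clipped to (eps, R_b]."""
    ratio = min(max(reflectance / background, eps), 1.0)
    return -math.log(ratio)


def mix(d_recto, d_verso, q_recto, q_verso):
    """Assumed forward model at one pixel.

    Each observed density is the side's own density plus the opposite
    side's density attenuated by a pixel-dependent gain.
    """
    obs_recto = d_recto + q_verso * d_verso
    obs_verso = d_verso + q_recto * d_recto
    return obs_recto, obs_verso


def unmix(obs_recto, obs_verso, q_recto, q_verso):
    """Invert the 2x2 per-pixel system, assuming the gains are known."""
    det = 1.0 - q_recto * q_verso
    if abs(det) < 1e-12:
        raise ValueError("gains make the per-pixel system singular")
    d_recto = (obs_recto - q_verso * obs_verso) / det
    d_verso = (obs_verso - q_recto * obs_recto) / det
    # Densities cannot be negative; the clamp is a choice made for this sketch.
    return max(d_recto, 0.0), max(d_verso, 0.0)


def separate(recto, verso, q_recto, q_verso, background=1.0):
    """Apply the per-pixel inversion to two reflectance images.

    All four arguments are equally sized lists of rows; the result is a
    pair of restored reflectance images.
    """
    out_recto, out_verso = [], []
    for row_r, row_v, row_qr, row_qv in zip(recto, verso, q_recto, q_verso):
        new_r, new_v = [], []
        for s_r, s_v, qr, qv in zip(row_r, row_v, row_qr, row_qv):
            d_r, d_v = unmix(
                optical_density(s_r, background),
                optical_density(s_v, background),
                qr,
                qv,
            )
            new_r.append(background * math.exp(-d_r))
            new_v.append(background * math.exp(-d_v))
        out_recto.append(new_r)
        out_verso.append(new_v)
    return out_recto, out_verso
```

Because the gains vary from pixel to pixel, the same code covers areas with no interference (gain 0) and areas with strong seeping (gain close to 1) without a global mixing matrix. In this sketch that per-pixel freedom is the only thing standing in for the unknown non-linearity.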
Pages: 2314-2318
Number of pages: 5