Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images

Cited by: 21
Authors
Akbari, Younes [1]
Al-Maadeed, Somaya [1]
Adam, Kalthoum [1]
Affiliations
[1] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
Keywords
Document image binarization; wavelet-based multichannel images; single and multiple CNNs; SegNet; U-net; DeepLabv3+; SEGMENTATION; COMPETITION; ALGORITHMS; ENERGY
DOI
10.1109/ACCESS.2020.3017783
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input. To create these images, the original source image is decomposed into wavelet subbands. The original image is then approximated from each subband separately, and the multichannel image is formed by placing the original source image (grayscale) in the first channel and the per-subband approximations in the remaining channels. Two scenarios are considered, two-channel and four-channel images, which are fed into two types of CNN architectures: single-stream and multiple-stream. To investigate the effect of the proposed multichannel inputs, three popular networks are used in these architectures: U-net, SegNet, and DeepLabv3+. The experimental results show that our method outperforms the same three CNNs trained on the original source images and achieves performance competitive with state-of-the-art results on the DIBCO database.
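The multichannel construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a single-level Haar wavelet, an image with even height and width, and an arbitrary choice of subbands (the record does not state which wavelet, decomposition level, or subbands the paper uses). The function names and the subband labels `LL`, `LH`, `HL`, `HH` are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform (assumed wavelet; needs even height/width)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2.0,  # local average (approximation)
        "LH": (a + b - c - d) / 2.0,  # difference between adjacent rows
        "HL": (a - b + c - d) / 2.0,  # difference between adjacent columns
        "HH": (a - b - c + d) / 2.0,  # diagonal difference
    }

def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the full-size image from four subbands."""
    ll, lh, hl, hh = (bands[k] for k in ("LL", "LH", "HL", "HH"))
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def approximate_from_subband(bands, keep):
    """Approximate the image from one subband by zeroing all the others."""
    zeroed = {k: (v if k == keep else np.zeros_like(v)) for k, v in bands.items()}
    return haar_idwt2(zeroed)

def multichannel_image(gray, subbands=("LL", "LH", "HL")):
    """Stack the grayscale image (channel 0) with per-subband approximations.

    The default subband choice is an assumption made for illustration only.
    """
    gray = np.asarray(gray, dtype=float)
    bands = haar_dwt2(gray)
    channels = [gray] + [approximate_from_subband(bands, k) for k in subbands]
    return np.stack(channels, axis=-1)
```

With three subbands the result has four channels, and with a single subband (for example `subbands=("LL",)`) it has two, matching the two scenarios named in the abstract. Because the transform is linear, the approximations from all four subbands sum back to the original image, which is a convenient sanity check on the decomposition.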
Pages: 153517-153534
Number of pages: 18