A hybrid CNN-Transformer model for Historical Document Image Binarization

Cited by: 3
Authors
Rezanezhad, Vahid [1]
Baierer, Konstantin [1]
Neudecker, Clemens [1]
Affiliations
[1] Staatsbibliothek Berlin Preuss Kulturbesitz, Berlin, Germany
Source
PROCEEDINGS OF THE 2023 INTERNATIONAL WORKSHOP ON HISTORICAL DOCUMENT IMAGING AND PROCESSING, HIP 2023 | 2023
Keywords
Binarization; Document processing; Image enhancement; OCR; Deep learning; Transformers
DOI
10.1145/3604951.3605508
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Document image binarization is one of the main preprocessing steps in document image analysis for text recognition. Noise, faint characters, bad scanning conditions, uneven lighting, or paper aging can cause artifacts that negatively impact text recognition algorithms. The task of binarization is to segment the foreground (text) from these degradations in order to improve optical character recognition (OCR) results. Convolutional Neural Networks (CNNs) are a popular method for binarization, but they are limited by their focus on the local context in a document image. We have applied a hybrid CNN-Transformer model to convert a document image into a binary output. For model training, we used datasets from the Document Image Binarization Contests (DIBCO). On the DIBCO-2012, DIBCO-2017 and DIBCO-2018 datasets, our model outperforms the state-of-the-art algorithms.
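To make the binarization task in the abstract concrete, the sketch below shows a classical global-threshold baseline (Otsu's method) in NumPy. It is not the hybrid CNN-Transformer model described in the paper, and the function names `otsu_threshold` and `binarize` are hypothetical, chosen here for illustration only. It assumes an 8-bit grayscale image with dark text on a brighter background.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the gray level (0..255) that maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                # probability of the "dark" class up to each level
    mu = np.cumsum(prob * np.arange(256))  # cumulative mean up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0              # levels where one class is empty
    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray) -> np.ndarray:
    """Map pixels to 0 (foreground, dark text) or 1 (background)."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)


# Synthetic example: a bright page (200) with a dark "stroke" (50).
page = np.full((8, 8), 200, dtype=np.uint8)
page[2:4, 2:6] = 50
mask = binarize(page)
```

A single global threshold like this usually breaks down under uneven lighting or stains, which is the kind of degradation the abstract names as a motivation for learned models.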
Pages: 79-84
Page count: 6