Multi-task Learning for Newspaper Image Segmentation and Baseline Detection Using Attention-Based U-Net Architecture

Cited by: 3
Authors
Bansal, Anukriti [1]
Mukherjee, Prerana [2]
Joshi, Divyansh [2]
Tripathi, Devashish [2]
Singh, Arun Pratap [2]
Affiliations
[1] LNM Inst Informat Technol, Jaipur, Rajasthan, India
[2] Jawaharlal Nehru Univ, Sch Engn, Delhi, India
Source
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT II | 2021 / Vol. 12917
Keywords
Multi-task learning; Newspaper document images; Attention; Text block segmentation; Text baseline detection;
DOI
10.1007/978-3-030-86159-9_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we propose an end-to-end, language-agnostic, multi-task learning U-Net framework for text block segmentation and baseline detection in document images. We improve the performance of U-Net by adding attention layers to the skip connections between the contracting and expansive paths. The generalization ability of the model is also validated on handwritten images. We perform exhaustive experiments on the ICPR2020 challenge dataset and obtain test accuracies of 96.09% and 99.44% for baseline detection and text block segmentation, respectively, on the simple track, and 97.47% and 98.51% for baseline detection and text block segmentation, respectively, on the complex track. The source code is publicly available at https://github.com/divyanshjoshi/Attention-U-Net-Newspaper-Text-Block-Segmentation.
Pages: 440-454
Number of pages: 15