Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks

Cited by: 30
Authors
Boillet, Melodie [1, 2]
Kermorvant, Christopher [1, 2]
Paquet, Thierry [1]
Affiliations
[1] Univ Rouen Normandie, LITIS, Rouen, France
[2] TEKLIA, Paris, France
Source
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2021
Keywords
Document Layout Analysis; Historical document; Fully Convolutional Network; Deep Learning;
DOI
10.1109/ICPR48806.2021.9412447
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods use models pre-trained on natural scene images, our method, Doc-UFCN, relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat line segmentation, and more generally layout analysis, as a pixel-wise classification task, so our model outputs a pixel-level labeling of the input images. We show that Doc-UFCN outperforms state-of-the-art methods on various datasets and demonstrate that components pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models using various metrics to provide a fair and complete comparison between the methods.
Pages: 2134-2141
Page count: 8