End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 15
Authors
Carbonell, Manuel [1]
Mas, Joan [2]
Villegas, Mauricio [1]
Fornes, Alicia [2]
Llados, Josep [2]
Affiliations
[1] Omni Us, Berlin, Germany
[2] Computer Vision Center, Barcelona, Spain
Source
2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), Vol. 5 | 2019
Keywords
Handwritten Text Recognition; Layout Analysis; Text Segmentation; Deep Neural Networks; Multi-task Learning
DOI
10.1109/ICDARW.2019.40077
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods perform recognition at the paragraph level. Even so, errors in paragraph segmentation can still degrade transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Joint text detection and transcription removes the need for layout analysis at test time. The experimental results show that our approach achieves results comparable to those of models that assume segmented paragraphs, and suggest that joining the two tasks improves on performing them separately.
Pages: 29-34
Page count: 6