A neural model for text localization, transcription and named entity recognition in full pages

Cited by: 36
Authors
Carbonell, Manuel [1, 2]
Fornes, Alicia [1]
Villegas, Mauricio [2]
Llados, Josep [1]
Affiliations
[1] Univ Autonoma Barcelona, Comp Vis Ctr, Dept Comp Sci, Barcelona, Spain
[2] Omni:us, Berlin, Germany
Funding
European Union Horizon 2020;
Keywords
Document image analysis; Information extraction; Text detection; Handwritten text recognition; Named entity recognition; Deep neural networks; Multi-task learning;
DOI
10.1016/j.patrec.2020.05.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the consolidation of deep neural network architectures for information extraction in document images has brought substantial improvements in the performance of each of the tasks involved in this process: text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, so that shared features can be learned simultaneously from the training error of each task. By doing so, the model jointly performs handwritten text detection, transcription, and named entity recognition at page level in a single feed-forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 219-227
Number of pages: 9