End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 15

Authors:

Carbonell, Manuel [1]

Mas, Joan [2]

Villegas, Mauricio [1]

Fornes, Alicia [2]

Llados, Josep [2]

Affiliations:

[1] Omni Us, Berlin, Germany

[2] Comp Vis Ctr, Barcelona, Spain

Source:

2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION WORKSHOPS (ICDARW), VOL 5 | 2019

Keywords:

Handwritten Text Recognition; Layout Analysis; Text segmentation; Deep Neural Networks; Multi-task learning;

DOI:

10.1109/ICDARW.2019.40077

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods propose performing recognition at the paragraph level. Even so, errors in paragraph segmentation can still affect transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Joint text detection and transcription removes the need for layout analysis at test time. The experimental results show that our approach can achieve results comparable to models that assume segmented paragraphs, and suggest that joining the two tasks brings an improvement over performing them separately.

Pages: 29-34

Page count: 6
