A comparison of sequential and combined approaches for named entity recognition in a corpus of handwritten medieval charters

Cited by: 15
Authors
Boros, Emanuela [1 ]
Romero, Veronica [2 ]
Maarand, Martin [1 ]
Zenklova, Katerina [3 ]
Kreckova, Jitka [3 ]
Vidal, Enrique [2 ]
Stutzmann, Dominique [4 ]
Kermorvant, Christopher [1 ]
Affiliations
[1] TEKLIA, Paris, France
[2] Univ Politecn Valencia, PRHLT Res Ctr, Valencia, Spain
[3] Narodni Arch, Prague, Czech Republic
[4] IRHT CNRS, Paris, France
Source
2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020) | 2020
Keywords
Named entity recognition; Handwritten text recognition; Historical document processing; Multilingualism
DOI
10.1109/ICFHR2020.2020.00025
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a new corpus of multilingual medieval handwritten charter images, annotated with full transcription and named entities. The corpus is used to compare two approaches to named entity recognition in historical document images in several languages: a sequential approach, the more commonly used of the two, which applies handwritten text recognition (HTR) and then named entity recognition (NER); and a combined approach, which transcribes the text line image and extracts the entities simultaneously. Experiments on the charter corpus in Latin, Early New High German and Old Czech for name, date and location recognition show that the combined approach performs better.
Pages: 79-84
Number of pages: 6