ICDAR 2024 Competition on Handwritten Text Recognition in Brazilian Essays - BRESSAY

Cited by: 0
Authors
Neto, Arthur F. S. [1]
Bezerra, Byron L. D. [1]
Araujo, Savio S. [1]
Souza, Wiliane M. A. S. [1]
Alves, Kleberson F. [2]
Oliveira, Macileide F. [3]
Lins, Samara V. S. [1]
Hazin, Hugo J. F. [1]
Rocha, Pedro H. V. [1]
Toselli, Alejandro H. [4]
Affiliations
[1] Univ Pernambuco, Recife, PE, Brazil
[2] Univ Fed Agreste Pernambuco, Garanhuns, Brazil
[3] Univ Fed Vale Sao Francisco, Petrolina, Brazil
[4] Univ Politecn Valencia, Valencia, Spain
Source
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2024, PT VI | 2024 / Vol. 14809
Keywords
dataset; Brazilian Portuguese essays; computer vision; deep learning; handwritten text recognition
DOI
10.1007/978-3-031-70552-6_21
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the "Handwritten Text Recognition in Brazilian Essays - BRESSAY" competition, held at the 18th International Conference on Document Analysis and Recognition (ICDAR 2024). The competition aimed to advance Handwritten Text Recognition (HTR) by addressing challenges specific to Brazilian Portuguese academic essays, such as diverse handwriting styles and document irregularities like smudges and erasures. Participants were encouraged to develop robust algorithms capable of accurately transcribing handwritten text at the line, paragraph, and page levels using the new BRESSAY dataset. The competition attracted 14 participants from different countries, and by its end 4 research groups had submitted a total of 11 proposals across the three challenges. These proposals achieved strong recognition rates and improved on traditional baseline models through key strategies such as preprocessing techniques, synthetic data approaches, and advanced deep learning models. The evaluation metrics were Character Error Rate (CER) and Word Error Rate (WER); the best results reached error rates as low as 2.88% CER and 9.39% WER for line-level recognition, 3.75% CER and 10.48% WER for paragraph-level recognition, and 3.77% CER and 10.08% WER for page-level recognition. The competition highlights the potential for continued improvements in HTR and underscores the value of the BRESSAY dataset as a resource for future research. The dataset is available in the repository (https://github.com/arthurflor23/handwritten-text-recognition).
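For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein edit distance between a reference transcription and a recognizer's output, divided by the reference length in characters or words. The function names (`edit_distance`, `cer`, `wer`) and the sample strings are illustrative assumptions and are not taken from the competition's evaluation code, which may apply additional text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference item
                            curr[j - 1] + 1,      # insert a hypothesis item
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: the recognizer drops two diacritics in one word.
reference = "a redação foi boa"
hypothesis = "a redacao foi boa"
print(f"CER = {cer(reference, hypothesis):.4f}")
print(f"WER = {wer(reference, hypothesis):.4f}")
```

Because a single wrong character makes the whole word count as an error, WER is typically several times higher than CER on the same output, which matches the gap between the two figures reported above.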
Pages: 345-362
Page count: 18