Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network

Author
El Bahi, Hassan [1]
Affiliation
[1] L2IS, Laboratory of Computer and Systems Engineering, Cadi Ayyad University, B.P. 511, Marrakech
Keywords
Ancient manuscripts; Convolutional neural network; Handwritten text recognition; Named entity recognition; Recurrent neural network
DOI
10.1007/s00500-024-09930-6
Abstract
Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and extraction of relevant information, such as named entities, from these manuscripts remain among the most difficult research problems because of factors such as poor manuscript quality, complex backgrounds, ink stains, and cursive handwriting. To meet these challenges, we propose two systems. The first performs handwritten text recognition (HTR) in ancient manuscripts. It starts with a preprocessing operation; a convolutional neural network (CNN) then extracts features from each input image; finally, a recurrent neural network (RNN) built from Long Short-Term Memory (LSTM) blocks, followed by a Connectionist Temporal Classification (CTC) layer, predicts the text contained in the image. The second system recognizes named entities and the relationships among words directly from images of old manuscripts, bypassing an intermediate text transcription step. Like the first system, it starts with a preprocessing step. Data augmentation is then used to enlarge the training dataset, after which a CNN model automatically extracts the most relevant features. Finally, a bidirectional LSTM recognizes the named entities and the relationships between word images. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve state-of-the-art performance, exceeding existing systems. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
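The abstract states that the first system ends in a CTC layer that predicts the text, but it does not say how the per-frame outputs are turned into a string. As an illustration only, the following minimal sketch shows greedy (best-path) CTC decoding, a common choice: take the best class at each time step, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the blank index 0, and the toy three-class alphabet are assumptions made for this example and are not taken from the paper.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-time-step score lists, one score per class.
    alphabet: indexable mapping from class index to character; the entry
              at `blank` is a placeholder and is never emitted.
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; what remains is the predicted character sequence.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Toy example: classes are [blank, 'a', 'b'].
frames = [
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.7, 0.2],    # 'a' (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two 'a' characters)
    [0.2, 0.6, 0.2],    # 'a'
    [0.1, 0.2, 0.7],    # 'b'
    [0.1, 0.3, 0.6],    # 'b' (repeat, merged)
]
print(ctc_greedy_decode(frames, "-ab"))  # prints "aab"
```

The blank frame between the second and third time steps is what allows the doubled letter to survive the merge step; without it, the three `a` frames would collapse into a single character.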
Pages: 12249-12268
Number of pages: 19