Discrete representation learning for handwritten text recognition

Cited by: 2
Authors
Davoudi, Homa [1]
Traviglia, Arianna [1]
Affiliations
[1] Fdn Ist Italiano Tecnol IIT, Ctr Cultural Heritage Technol CCHT, Via Torino 155, I-30172 Venice, Italy
Keywords
Handwritten text recognition; Deep learning; Representation learning; Discrete latent variables; LEXICON REDUCTION; NEURAL-NETWORK; SEQUENCE; MODELS
DOI
10.1007/s00521-023-08445-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Handwritten text recognition, i.e., the conversion of scanned handwritten documents into machine-readable text, is a challenging task due to the variability and complexity of handwriting. A common approach to handwritten text recognition consists of a feature extraction step followed by a recognizer. In this paper, we propose a novel DNN architecture for handwritten text recognition that extracts a discrete representation from the input text-line image. The proposed model consists of an encoder-decoder network with an added quantization layer, which applies a dictionary of representative vectors to discretize the latent variables. The dictionary and the network parameters are trained jointly, through the k-means algorithm and backpropagation, respectively. The performance of the proposed model is evaluated through extensive experiments on five datasets, analyzing the effect of discrete representation on handwriting recognition. The results demonstrate that feature discretization improves the performance of deep handwritten text recognition models compared to conventional DNN models with continuous representation. Specifically, the character error rate is decreased by 22% and 21.1% on the IAM and ICFHR18 datasets, respectively.
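The quantization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a squared Euclidean nearest-neighbour lookup against the dictionary and a plain batch k-means update, and it leaves out the encoder, the decoder, and backpropagation entirely. All names (`quantize`, `kmeans_update`, `codebook`) and the toy values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest dictionary vector."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]


def kmeans_update(latents, codebook, indices):
    """One k-means step: move each dictionary vector to the mean of its assigned latents.

    Dictionary vectors with no assigned latents are left unchanged (an assumption).
    """
    updated = []
    for k, code in enumerate(codebook):
        members = [z for z, i in zip(latents, indices) if i == k]
        if members:
            updated.append(
                [sum(m[d] for m in members) / len(members) for d in range(len(code))]
            )
        else:
            updated.append(list(code))
    return updated


# Toy continuous latents (stand-ins for encoder outputs) and a two-entry dictionary.
latents = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
codebook = [[0.0, 0.0], [1.0, 1.0]]

indices = quantize(latents, codebook)                 # discrete codes
codebook = kmeans_update(latents, codebook, indices)  # dictionary update
quantized = [codebook[i] for i in indices]            # discretized latents
```

In this sketch the discrete representation is the list `indices`, while `quantized` holds the dictionary vectors that would be passed on to a decoder.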
Pages: 15759-15773
Number of pages: 15