Simultaneous Script Identification and Handwriting Recognition via Multi-Task Learning of Recurrent Neural Networks

Cited by: 23
Authors
Chen, Zhuo [1,2]
Wu, Yichao [1,2]
Yin, Pei [1]
Liu, Cheng-Lin [1,2]
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguan East Rd, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Funding
National Natural Science Foundation of China;
Keywords
multi-task learning; SepMDLSTM; script identification; language identification; handwritten text recognition;
DOI
10.1109/ICDAR.2017.92
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a method for simultaneous script identification and handwritten text line recognition in a multi-task learning framework. First, we use Separable Multi-Dimensional Long Short-Term Memory (SepMDLSTM) to encode the input text line images based on convolutional feature extraction. The extracted features are then fed into two classification modules, one for script identification and one for multi-script text recognition. All network parameters are trained end-to-end by multi-task learning, where the script identification task minimizes the Negative Log Likelihood (NLL) loss and the text recognition task minimizes the Connectionist Temporal Classification (CTC) loss. We evaluated the proposed method on handwritten text line datasets in three languages: IAM (English), Rimes (French), and IFN/ENIT (Arabic). Experimental results show that the multi-task learning framework achieves superior performance on both script identification and text recognition. In particular, the script identification accuracy is higher than 99.9%, and the character error rate (CER) of text recognition is even lower than that of some single-script text recognition systems.
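The two training objectives named in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the blank index `BLANK = 0`, the helper names, and in particular the `script_weight` factor used to combine the two losses are assumptions made for illustration, since the abstract does not say how the losses are weighted. The CTC loss is computed with the standard forward algorithm over per-frame symbol probabilities.

```python
import math

BLANK = 0  # index of the CTC blank symbol (an assumption of this sketch)


def nll_loss(class_probs, target):
    """Negative log likelihood of the true script class."""
    return -math.log(class_probs[target])


def ctc_loss(frame_probs, labels):
    """CTC loss computed with the forward algorithm.

    frame_probs[t][k] is the probability of symbol k at frame t.
    labels is the target sequence of non-blank symbol indices.
    Raises ValueError if no alignment is possible (likelihood 0).
    """
    # Interleave blanks with the labels: l1 l2 -> _ l1 _ l2 _
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]
    size = len(ext)

    # Initialisation: a path may start with a blank or the first label.
    alpha = [0.0] * size
    alpha[0] = frame_probs[0][ext[0]]
    if size > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for t in range(1, len(frame_probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank in between
            new_alpha[s] = total * frame_probs[t][ext[s]]
        alpha = new_alpha

    # A valid path ends on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)


def joint_loss(frame_probs, labels, script_probs, script, script_weight=1.0):
    """Hypothetical weighted sum of the recognition and script losses."""
    return ctc_loss(frame_probs, labels) + script_weight * nll_loss(script_probs, script)


# Two frames, two symbols (blank and "a"), uniform probabilities.
frames = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_loss(frames, [1]))
print(joint_loss(frames, [1], script_probs=[0.1, 0.9], script=1))
```

In the toy example the label "a" can be produced by three of the four equally likely alignments ("a" then blank, blank then "a", "a" repeated), so the sequence likelihood is 0.75. In a real system both losses would be computed from the outputs of the shared encoder and backpropagated together.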
Pages: 525-530
Number of pages: 6