Simultaneous Script Identification and Handwriting Recognition via Multi-Task Learning of Recurrent Neural Networks

Cited by: 23
Authors
Chen, Zhuo [1,2]
Wu, Yichao [1,2]
Yin, Pei [1]
Liu, Cheng-Lin [1,2]
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguan East Rd, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Funding
National Natural Science Foundation of China;
Keywords
multi-task learning; SepMDLSTM; script identification; language identification; handwritten text recognition;
DOI
10.1109/ICDAR.2017.92
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a method for simultaneous script identification and handwritten text line recognition in a multi-task learning framework. First, we use Separable Multi-Dimensional Long Short-Term Memory (SepMDLSTM) to encode the input text line images based on convolutional feature extraction. Then, the extracted features are fed into two classification modules, one for script identification and one for multi-script text recognition. All network parameters are trained end-to-end by multi-task learning, where the script identification task minimizes the Negative Log Likelihood (NLL) loss and the text recognition task minimizes the Connectionist Temporal Classification (CTC) loss. We evaluated the proposed method on handwritten text line datasets in three languages: IAM (English), Rimes (French), and IFN/ENIT (Arabic). Experimental results show that the multi-task learning framework achieves superior performance on both script identification and text recognition. In particular, the script identification accuracy is higher than 99.9%, and the character error rate (CER) of text recognition is even lower than that of some single-script text recognition systems.
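The joint objective described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy inputs, and in particular the `weight` parameter that balances the two losses are assumptions made for illustration, since the abstract does not say how the NLL and CTC terms are combined. The CTC term is computed with the standard forward algorithm in log space.

```python
import math

NEG_INF = float("-inf")

def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over the given values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def nll_loss(class_log_probs, target):
    """NLL of the target class (here: the script label of a text line)."""
    return -class_log_probs[target]

def ctc_loss(frame_log_probs, labels, blank=0):
    """CTC negative log likelihood via the forward algorithm.

    frame_log_probs: T x C nested list of per-frame log-probabilities.
    labels: target label sequence without blanks.
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(frame_log_probs)

    # Initialisation: a path may start with a blank or the first label.
    alpha = [NEG_INF] * S
    alpha[0] = frame_log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = frame_log_probs[0][ext[1]]

    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]                       # stay on the same symbol
            if s >= 1:
                terms.append(alpha[s - 1])           # advance by one
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])           # skip the blank in between
            new[s] = logsumexp(*terms) + frame_log_probs[t][ext[s]]
        alpha = new

    # A valid path ends on the last label or the trailing blank.
    final = logsumexp(alpha[S - 1], alpha[S - 2]) if S > 1 else alpha[0]
    return -final

def multi_task_loss(script_log_probs, script_target,
                    frame_log_probs, labels, weight=1.0):
    """Hypothetical combination: CTC loss plus weighted script NLL loss."""
    return (ctc_loss(frame_log_probs, labels)
            + weight * nll_loss(script_log_probs, script_target))

# Toy usage: 3 scripts, 2 frames, 3 output classes (blank + 2 characters),
# all distributions uniform.
uniform = [math.log(1 / 3)] * 3
loss = multi_task_loss(uniform, 0, [uniform, uniform], [1])
```

In a trained system both terms would be back-propagated through the shared encoder; the sketch only evaluates the loss values and does not compute gradients.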
Pages: 525-530
Page count: 6
Related papers
50 in total
  • [41] Captcha Recognition based on Multi-task Convolutional Neural Network and Active Learning
    Qiu, Jucheng
    Wu, Xiaoyu
    2021 IEEE FOURTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE 2021), 2021: 108 - 112
  • [42] Multi-Task Self-Supervised Learning for Script Event Prediction
    Zhou, Bo
    Chen, Yubo
    Liu, Kang
    Zhao, Jun
    Xu, Jiexin
    Jiang, Xiaojian
    Li, Jinlong
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021: 3662 - 3666
  • [43] AMRNN: attended multi-task recurrent neural networks for dynamic illness severity prediction
    Weitong Chen
    Guodong Long
    Lina Yao
    Quan Z. Sheng
    World Wide Web, 2020, 23 : 2753 - 2770
  • [44] AMRNN: attended multi-task recurrent neural networks for dynamic illness severity prediction
    Chen, Weitong
    Long, Guodong
    Yao, Lina
    Sheng, Quan Z.
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (05): 2753 - 2770
  • [45] Speech Emotion Recognition using Decomposed Speech via Multi-task Learning
    Hsu, Jia-Hao
    Wu, Chung-Hsien
    Wei, Yu-Hung
    INTERSPEECH 2023, 2023: 4553 - 4557
  • [46] End-to-end Japanese Multi-dialect Speech Recognition and Dialect Identification with Multi-task Learning
    Imaizumi, Ryo
    Masumura, Ryo
    Shiota, Sayaka
    Kiya, Hitoshi
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2022, 11 (01)
  • [47] Multi-Task CTC for Joint Handwriting Recognition and Character Bounding Box Prediction
    Wigington, Curtis
    PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, DOCENG 2023, 2023,
  • [48] Multi-Task Learning in Deep Neural Networks for Mandarin-English Code-Mixing Speech Recognition
    Chen, Mengzhe
    Pan, Jielin
    Zhao, Qingwei
    Yan, Yonghong
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (10): 2554 - 2557
  • [49] Using Multi-task Learning to Improve Diagnostic Performance of Convolutional Neural Networks
    Fang, Mengjie
    Dong, Di
    Sun, Ruijia
    Fan, Li
    Sun, Yingshi
    Liu, Shiyuan
    Tian, Jie
    MEDICAL IMAGING 2019: COMPUTER-AIDED DIAGNOSIS, 2019, 10950
  • [50] Offensive language identification with multi-task learning
    Marcos Zampieri
    Tharindu Ranasinghe
    Diptanu Sarkar
    Alex Ororbia
    Journal of Intelligent Information Systems, 2023, 60 : 613 - 630