RTNet: An End-to-End Method for Handwritten Text Image Translation

Cited by: 4
Authors
Su, Tonghua [1]
Liu, Shuchen [1]
Zhou, Shengjie [1]
Affiliations
[1] Harbin Inst Technol, Sch Software, Harbin, Peoples R China
Source
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II | 2021 / Vol. 12822
Funding
National Natural Science Foundation of China
Keywords
Machine translation; Text recognition; Image text translation; Handwritten text; End-to-End;
DOI
10.1007/978-3-030-86331-9_7
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text image recognition and translation have a wide range of applications. A straightforward solution is a two-stage approach: first recognize the text, then translate it into the target language. In that setup the handwritten text recognition model and the machine translation model are trained separately, so any transcription error may degrade the translation quality. This paper proposes an end-to-end learning architecture that directly translates English handwritten text in images into Chinese. The handwriting recognition task and the translation task are combined in a unified deep learning model. The model first performs visual encoding, then bridges the semantic gap with a feature transformer, and finally uses a textual decoder to generate the target sentence. To train the model effectively, transfer learning is used to improve its generalization under low-resource conditions. Experiments compare the method with the traditional two-stage one. The results indicate that the performance of the end-to-end model improves greatly as the amount of training data increases. Furthermore, when a larger amount of training data is available, the end-to-end model is more advantageous.
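The three stages named in the abstract (visual encoding, feature transformer, textual decoder) can be pictured as a simple function composition. The following is only a toy sketch of that data flow, not the RTNet implementation: the function names, the `[mean, max]` pooling, the modular token rule, and the `VOCAB` list are all hypothetical stand-ins for learned neural layers.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale text-line image: rows of pixel values in [0, 1]
Features = List[List[float]]       # one feature vector per time step

VOCAB = ["<eos>", "你", "好", "世", "界"]  # hypothetical target-language vocabulary


def visual_encode(image: Image) -> Features:
    """Toy visual encoder: one feature vector per pixel column, left to right."""
    return [list(column) for column in zip(*image)]


def transform_features(visual: Features) -> Features:
    """Toy feature transformer: summarize each visual vector as [mean, max].

    Stands in for a learned mapping from visual to semantic features.
    """
    return [[sum(vec) / len(vec), max(vec)] for vec in visual]


def decode_text(semantic: Features, max_len: int = 10) -> List[str]:
    """Toy autoregressive decoder: each step uses the features and the previous token."""
    # Uniform averaging over time steps, standing in for attention.
    context = sum(vec[0] for vec in semantic) / len(semantic)
    tokens: List[str] = []
    prev = 0
    for _ in range(max_len):
        # Arbitrary deterministic rule in place of a learned next-token distribution.
        prev = (prev + 1 + int(context * 10)) % len(VOCAB)
        if VOCAB[prev] == "<eos>":
            break
        tokens.append(VOCAB[prev])
    return tokens


def translate_image(image: Image) -> List[str]:
    """End-to-end path: no intermediate source-language transcript is produced."""
    return decode_text(transform_features(visual_encode(image)))
```

The point of the sketch is the interface: the only values passed between stages are continuous feature sequences, whereas a two-stage system would pass a discrete English transcript, which is where recognition errors would enter the translation step.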
Pages: 99-113
Page count: 15