RTNet: An End-to-End Method for Handwritten Text Image Translation

Cited by: 4
Authors
Su, Tonghua [1]
Liu, Shuchen [1]
Zhou, Shengjie [1]
Affiliations
[1] Harbin Inst Technol, Sch Software, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine translation; Text recognition; Image text translation; Handwritten text; End-to-End
DOI
10.1007/978-3-030-86331-9_7
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text image recognition and translation have a wide range of applications. A straightforward solution is a two-stage approach: first recognize the text, then translate it into the target language. In that approach the handwritten text recognition model and the machine translation model are trained separately, so any transcription error may degrade the translation quality. This paper proposes an end-to-end learning architecture that directly translates English handwritten text in images into Chinese. The handwriting recognition task and the translation task are combined in a unified deep learning model. First, we perform visual encoding; next, we bridge the semantic gap using a feature transformer; finally, a textual decoder generates the target sentence. To train the model effectively, we use transfer learning to improve its generalization under low-resource conditions. Experiments compare our method with the traditional two-stage one. The results indicate that the performance of the end-to-end model improves greatly as the amount of training data increases. Furthermore, when a larger amount of training data is available, the end-to-end model is more advantageous.
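The three stages named in the abstract (visual encoding, feature transformation, textual decoding) can be pictured as a simple function composition. The sketch below is only a toy illustration of that data flow, not the authors' model: every function body, the two-value feature vectors, the bucketing rule in the decoder, and the three-token vocabulary are assumptions invented for this example, and pixel values are assumed to lie in [0, 1].

```python
from typing import List, Sequence

Image = List[List[float]]     # grayscale image as rows of pixels in [0, 1] (assumed)
Features = List[List[float]]  # a sequence of feature vectors


def visual_encoder(image: Image) -> Features:
    """Toy stand-in: one [mean, max] feature vector per image column."""
    columns = list(zip(*image))
    return [[sum(col) / len(col), max(col)] for col in columns]


def feature_transformer(visual: Features) -> Features:
    """Toy stand-in: a fixed linear mix mapping visual to 'semantic' features."""
    return [[0.5 * mean + 0.5 * peak, peak - mean] for mean, peak in visual]


def textual_decoder(semantic: Features, vocab: Sequence[str],
                    max_len: int = 8) -> List[str]:
    """Toy stand-in: emit one token per step by bucketing the first feature."""
    tokens = []
    for vec in semantic[:max_len]:
        index = min(int(vec[0] * len(vocab)), len(vocab) - 1)
        tokens.append(vocab[index])
    return tokens


def translate_image(image: Image, vocab: Sequence[str]) -> List[str]:
    """End-to-end path: image -> visual features -> semantic features -> tokens."""
    return textual_decoder(feature_transformer(visual_encoder(image)), vocab)


if __name__ == "__main__":
    toy_image = [[0.0, 0.5, 1.0],
                 [0.0, 0.5, 1.0]]
    print(translate_image(toy_image, ["你", "好", "吗"]))
```

The point of the composition is that no intermediate transcript is produced between the image and the target tokens, which is the property the abstract contrasts with the two-stage pipeline. In a trained model each stand-in would be a learned network.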
Pages: 99-113
Page count: 15
Related papers
50 in total
  • [1] Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task
    Ma, Cong
    Zhang, Yaping
    Tu, Mei
    Han, Xu
    Wu, Linghui
    Zhao, Yang
    Zhou, Yu
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1664 - 1670
  • [2] An end-to-end handwritten text recognition method using residual attention networks
    Wang Y.-T.
    Zheng H.
    Chang H.-Y.
    Li S.
    Kongzhi yu Juece/Control and Decision, 2023, 38 (07): : 1825 - 1834
  • [3] Modal Contrastive Learning Based End-to-End Text Image Machine Translation
    Ma, Cong
    Han, Xu
    Wu, Linghui
    Zhang, Yaping
    Zhao, Yang
    Zhou, Yu
    Zong, Chengqing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 2153 - 2165
  • [4] End-to-End Handwritten Text Detection and Transcription in Full Pages
    Carbonell, Manuel
    Mas, Joan
    Villegas, Mauricio
    Fornes, Alicia
    Llados, Josep
    2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION WORKSHOPS (ICDARW), VOL 5, 2019, : 29 - 34
  • [5] SimulSpeech: End-to-End Simultaneous Speech to Text Translation
    Ren, Yi
    Liu, Jinglin
    Tan, Xu
    Zhang, Chen
    Qin, Tao
    Zhao, Zhou
    Liu, Tie-Yan
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3787 - 3796
  • [6] A COMPARATIVE STUDY ON END-TO-END SPEECH TO TEXT TRANSLATION
    Bahar, Parnia
    Bieschke, Tobias
    Ney, Hermann
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 792 - 799
  • [7] End-to-End Speech-to-Text Translation: A Survey
    Sethiya, Nivedita
    Maurya, Chandresh Kumar
    COMPUTER SPEECH AND LANGUAGE, 2025, 90
  • [8] End-to-End page-Level assessment of handwritten text recognition
    Vidal, Enrique
    Toselli, Alejandro H.
    Rios-Vila, Antonio
    Calvo-Zaragoza, Jorge
    PATTERN RECOGNITION, 2023, 142
  • [9] Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks
    Junho Jo
    Hyung Il Koo
    Jae Woong Soh
    Nam Ik Cho
    Multimedia Tools and Applications, 2020, 79 : 32137 - 32150
  • [10] End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning
    Malhotra, Ruchika
    Addis, Maru Tesfaye
    IEEE ACCESS, 2023, 11 : 99535 - 99545