Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task

Cited by: 4
Authors
Ma, Cong [1, 2]
Zhang, Yaping [1, 2]
Tu, Mei [4]
Han, Xu [1, 2]
Wu, Linghui [1, 2]
Zhao, Yang [1, 2]
Zhou, Yu [2, 3]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit NLPR, 95 Zhongguan East Rd, Beijing 100190, Peoples R China
[3] Zhongke Fanyu Technol Co Ltd, Fanyu AI Lab, Beijing 100190, Peoples R China
[4] Samsung Res China Beijing SRC B, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
RECOGNITION; SEQUENCE
DOI
10.1109/ICPR56361.2022.9956695
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
End-to-end text image translation (TIT), which aims to translate source-language text embedded in images into the target language, has attracted intensive attention in recent research. However, data sparsity limits the performance of end-to-end text image translation. Multi-task learning can alleviate this problem by exploiting knowledge from complementary related tasks. In this paper, we propose a novel text-translation-enhanced text image translation method, which trains the end-to-end model with text translation as an auxiliary task. By sharing model parameters and training on multiple tasks, our model can take full advantage of easily available large-scale parallel text corpora. Extensive experimental results show that our proposed method outperforms existing end-to-end methods, and that joint multi-task learning with both text translation and recognition tasks achieves better results, indicating that the translation and recognition auxiliary tasks are complementary.
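The abstract describes training with text translation (and optionally recognition) as auxiliary tasks, but does not say how the task losses are combined. One common formulation of multi-task training is a weighted sum of per-task losses; the sketch below illustrates that idea only. The function name `joint_loss`, the weights `mt_weight` and `ocr_weight`, and the weighted-sum form itself are assumptions for illustration and are not taken from the paper.

```python
from typing import Optional


def joint_loss(
    tit_loss: float,
    mt_loss: float,
    ocr_loss: Optional[float] = None,
    mt_weight: float = 1.0,   # hypothetical weight for the text translation task
    ocr_weight: float = 1.0,  # hypothetical weight for the recognition task
) -> float:
    """Combine the main TIT loss with auxiliary task losses as a weighted sum.

    This is an assumed formulation, not the paper's stated objective.
    """
    total = tit_loss + mt_weight * mt_loss
    if ocr_loss is not None:
        # The recognition task is optional; add it only when it is provided.
        total += ocr_weight * ocr_loss
    return total


if __name__ == "__main__":
    # Main task plus the text translation auxiliary task only.
    print(joint_loss(2.0, 1.0))
    # Main task plus both auxiliary tasks with custom weights.
    print(joint_loss(2.0, 1.0, ocr_loss=0.5, mt_weight=0.5, ocr_weight=2.0))
```

In an actual training loop the three losses would typically come from model components that share parameters, so that gradients from the auxiliary tasks also update the shared layers.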
Pages: 1664-1670
Page count: 7