TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS

Cited by: 12
Authors:
Indurthi, Sathish [1 ]
Zaidi, Mohd Abbas [1 ]
Lakumarapu, Nikhil Kumar [1 ]
Lee, Beomseok [1 ]
Han, Hyojung [1 ]
Ahn, Seokchan [1 ]
Kim, Sangha [1 ]
Kim, Chanwoo [1 ]
Hwang, Inchul [1 ]
Affiliations:
[1] Samsung Res, Seoul, South Korea
Keywords:
Speech Translation; Speech Recognition; Task Modulation; Multitask Learning
DOI:
10.1109/ICASSP39728.2021.9414703
Chinese Library Classification (CLC): O42 [Acoustics]
Subject classification codes: 070206; 082403
Abstract:
In general, direct speech-to-text translation (ST) is trained jointly with automatic speech recognition (ASR) and machine translation (MT) tasks. However, issues with current joint learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network that allows the model to learn task-specific features while simultaneously learning shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves 28.64 BLEU on the MuST-C English-German ST task, a WER of 11.61% on the TED-LIUM v3 ASR task, and 23.35 BLEU on the WMT'15 English-German MT task. This sets a new state-of-the-art (SOTA) result on the ST task while outperforming existing end-to-end ASR systems.
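The abstract does not spell out the task modulation architecture, so the sketch below is a rough illustration only, not the authors' method: a FiLM-style modulation layer in PyTorch in which a learned task embedding produces a gate (scale) and shift applied to shared encoder states, one common way to carve task-specific features out of a shared representation. The class name TaskModulation, the parameters num_tasks and d_model, and the task-id convention are all hypothetical.

```python
# Hypothetical FiLM-style task modulation layer; an illustrative sketch,
# NOT the architecture from the paper. A per-task embedding is mapped to
# a sigmoid gate and an additive shift that modulate shared features, so
# one model can serve ASR, MT, and ST without separate fine-tuning.
import torch
import torch.nn as nn


class TaskModulation(nn.Module):
    """Modulates shared hidden states with task-specific scale and shift."""

    def __init__(self, num_tasks: int, d_model: int):
        super().__init__()
        # One embedding per task; ids are an assumed convention (0=ASR, 1=MT, 2=ST).
        self.task_embed = nn.Embedding(num_tasks, d_model)
        self.to_scale = nn.Linear(d_model, d_model)
        self.to_shift = nn.Linear(d_model, d_model)

    def forward(self, shared: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # shared: (batch, seq_len, d_model); task_id: (batch,)
        t = self.task_embed(task_id)                           # (batch, d_model)
        scale = torch.sigmoid(self.to_scale(t)).unsqueeze(1)   # gate in (0, 1)
        shift = self.to_shift(t).unsqueeze(1)                  # (batch, 1, d_model)
        # Task-specific features ride on top of the shared representation.
        return shared * scale + shift


# Usage: modulate a batch of shared encoder states for the ST task (id 2).
layer = TaskModulation(num_tasks=3, d_model=512)
states = torch.randn(4, 100, 512)
out = layer(states, torch.full((4,), 2, dtype=torch.long))
print(out.shape)  # torch.Size([4, 100, 512])
```

Because the gate and shift are the only task-specific parameters here, all tasks share the bulk of the network, which matches the paper's stated goal of a single model that performs all three tasks.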
Pages: 7723-7727
Page count: 5