TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS

Cited by: 12
Authors
Indurthi, Sathish [1]
Zaidi, Mohd Abbas [1]
Lakumarapu, Nikhil Kumar [1]
Lee, Beomseok [1]
Han, Hyojung [1]
Ahn, Seokchan [1]
Kim, Sangha [1]
Kim, Chanwoo [1]
Hwang, Inchul [1]
Affiliation
[1] Samsung Res, Seoul, South Korea
Keywords
Speech Translation; Speech Recognition; Task Modulation; Multitask Learning
DOI
10.1109/ICASSP39728.2021.9414703
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Direct speech-to-text translation (ST) is generally trained jointly with automatic speech recognition (ASR) and machine translation (MT) tasks. However, issues with current joint learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network that allows the model to learn task-specific features while simultaneously learning shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves a BLEU score of 28.64 on the MuST-C English-German ST task, a WER of 11.61% on the TEDLium v3 ASR task, and a BLEU score of 23.35 on the WMT'15 English-German MT task. This sets a new state of the art (SOTA) on the ST task while outperforming existing end-to-end ASR systems.
Pages: 7723-7727
Number of pages: 5