TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS

Cited by: 12
Authors
Indurthi, Sathish [1]
Zaidi, Mohd Abbas [1]
Lakumarapu, Nikhil Kumar [1]
Lee, Beomseok [1]
Han, Hyojung [1]
Ahn, Seokchan [1]
Kim, Sangha [1]
Kim, Chanwoo [1]
Hwang, Inchul [1]
Affiliations
[1] Samsung Res, Seoul, South Korea
Keywords
Speech Translation; Speech Recognition; Task Modulation; Multitask Learning;
DOI
10.1109/ICASSP39728.2021.9414703
CLC Number
O42 [Acoustics]
Subject Classification Code
070206; 082403
Abstract
In general, direct speech-to-text translation (ST) is trained jointly with automatic speech recognition (ASR) and machine translation (MT) tasks. However, issues with current joint-learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network which allows the model to learn task-specific features while simultaneously learning shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves a BLEU score of 28.64 on the MuST-C English-German ST task, a WER of 11.61% on the TEDLium v3 ASR task, and a BLEU score of 23.35 on the WMT'15 English-German MT task. It sets a new state-of-the-art (SOTA) on the ST task while outperforming existing end-to-end ASR systems.
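The abstract describes the task modulation network only at a high level, so the following is a minimal, hypothetical PyTorch sketch of one way such a module could condition shared encoder features on a task identity: a learned task embedding produces FiLM-style per-channel scale and shift parameters that modulate the shared representation, letting one backbone serve ASR, MT, and ST. The class name TaskModulation, the task-id convention, and the FiLM-style design are illustrative assumptions, not the authors' published architecture.

# Hypothetical sketch of a task-modulation layer; NOT the paper's exact design.
# Assumption: a learned task embedding yields FiLM-style (scale, shift)
# parameters that modulate features produced by a shared encoder.
import torch
import torch.nn as nn

class TaskModulation(nn.Module):
    def __init__(self, num_tasks: int, d_model: int):
        super().__init__()
        self.task_embed = nn.Embedding(num_tasks, d_model)
        # Maps a task embedding to per-channel scale (gamma) and shift (beta).
        self.to_film = nn.Linear(d_model, 2 * d_model)

    def forward(self, shared_feats: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # shared_feats: (batch, time, d_model) states from the shared encoder
        # task_id:      (batch,) integer ids, e.g. 0 = ASR, 1 = MT, 2 = ST
        gamma, beta = self.to_film(self.task_embed(task_id)).chunk(2, dim=-1)
        # Broadcast over the time dimension and modulate the shared representation.
        return shared_feats * (1 + gamma.unsqueeze(1)) + beta.unsqueeze(1)

if __name__ == "__main__":
    layer = TaskModulation(num_tasks=3, d_model=256)
    feats = torch.randn(4, 50, 256)    # a batch of shared encoder states
    task = torch.tensor([0, 1, 2, 2])  # per-example task ids
    print(layer(feats, task).shape)    # torch.Size([4, 50, 256])

In a sketch like this, the task-specific parameters stay small relative to the shared encoder, which is one plausible way a single model could cover all three tasks without a separate fine-tuning pass.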
Pages: 7723 - 7727
Number of pages: 5
Related Papers
50 records in total
  • [31] Virtual Tasks but Real Gains: Improving Multi-Task Learning
    Pranavan, Theivendiram
    Sim, Terence
    Li, Jianshu
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 4829 - 4836
  • [32] Unsupervised Joint Multi-Task Learning of Vision Geometry Tasks
    Jha, Prabhash Kumar
    Tsanev, Doychin
    Lukic, Luka
    2021 IEEE INTELLIGENT VEHICLES SYMPOSIUM WORKSHOPS (IV WORKSHOPS), 2021, : 215 - 221
  • [33] Learning Sparse Task Relations in Multi-Task Learning
    Zhang, Yu
    Yang, Qiang
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2914 - 2920
  • [34] Task Variance Regularized Multi-Task Learning
    Mao, Yuren
    Wang, Zekai
    Liu, Weiwei
    Lin, Xuemin
    Hu, Wenbin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8615 - 8629
  • [35] Task Switching Network for Multi-task Learning
    Sun, Guolei
    Probst, Thomas
    Paudel, Danda Pani
    Popovic, Nikola
    Kanakis, Menelaos
    Patel, Jagruti
    Dai, Dengxin
    Van Gool, Luc
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8271 - 8280
  • [36] Learning to Teach Fairness-Aware Deep Multi-task Learning
    Roy, Arjun
    Ntoutsi, Eirini
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I, 2023, 13713 : 710 - 726
  • [37] A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech
    Park, Byeongseon
    Yamamoto, Ryuichi
    Tachibana, Kentaro
    INTERSPEECH 2022, 2022, : 1931 - 1935
  • [39] Density-Aware Multi-Task Learning for Crowd Counting
    Jiang, Xiaoheng
    Zhang, Li
    Zhang, Tianzhu
    Lv, Pei
    Zhou, Bing
    Pang, Yanwei
    Xu, Mingliang
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 443 - 453
  • [40] Estimating the influence of auxiliary tasks for multi-task learning of sequence tagging tasks
    Schroeder, Fynn
    Biemann, Chris
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2971 - 2985