TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS

Cited by: 15
Authors
Indurthi, Sathish [1]
Zaidi, Mohd Abbas [1]
Lakumarapu, Nikhil Kumar [1]
Lee, Beomseok [1]
Han, Hyojung [1]
Ahn, Seokchan [1]
Kim, Sangha [1]
Kim, Chanwoo [1]
Hwang, Inchul [1]
Affiliations
[1] Samsung Res, Seoul, South Korea
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Speech Translation; Speech Recognition; Task Modulation; Multitask Learning
DOI
10.1109/ICASSP39728.2021.9414703
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Direct speech-to-text translation (ST) is generally trained jointly with automatic speech recognition (ASR) and machine translation (MT) tasks. However, issues with current joint learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network that allows the model to learn task-specific features while simultaneously learning shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves a BLEU score of 28.64 on the MuST-C English-German ST task, a WER of 11.61% on the TEDLium v3 ASR task, and a BLEU score of 23.35 on the WMT'15 English-German MT task. This sets a new state of the art (SOTA) on the ST task while outperforming existing end-to-end ASR systems.
Pages: 7723-7727
Page count: 5