TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS

Cited by: 12
Authors:
Indurthi, Sathish [1 ]
Zaidi, Mohd Abbas [1 ]
Lakumarapu, Nikhil Kumar [1 ]
Lee, Beomseok [1 ]
Han, Hyojung [1 ]
Ahn, Seokchan [1 ]
Kim, Sangha [1 ]
Kim, Chanwoo [1 ]
Hwang, Inchul [1 ]
Affiliations:
[1] Samsung Res, Seoul, South Korea
Keywords:
Speech Translation; Speech Recognition; Task Modulation; Multitask Learning
DOI:
10.1109/ICASSP39728.2021.9414703
Chinese Library Classification (CLC): O42 [Acoustics]
Subject classification codes: 070206; 082403
Abstract:
In general, direct speech-to-text translation (ST) is trained jointly with automatic speech recognition (ASR) and machine translation (MT) tasks. However, issues with current joint learning strategies inhibit knowledge transfer across these tasks. We propose a task modulation network that allows the model to learn task-specific features while simultaneously learning shared features. The proposed approach removes the need for a separate fine-tuning step, resulting in a single model that performs all of these tasks. This single model achieves 28.64 BLEU on the MuST-C English-German ST task, a WER of 11.61% on the TED-LIUM v3 ASR task, and 23.35 BLEU on the WMT'15 English-German MT task. This sets a new state-of-the-art (SOTA) result on the ST task while outperforming existing end-to-end ASR systems.
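The abstract does not spell out the task modulation architecture, so the sketch below is a rough illustration only, not the authors' method: a FiLM-style modulation layer in PyTorch in which a learned task embedding produces a gate (scale) and shift applied to shared encoder states, one common way to carve task-specific features out of a shared representation. The class name TaskModulation, the parameters num_tasks and d_model, and the task-id convention are all hypothetical.

```python
# Hypothetical FiLM-style task modulation layer; an illustrative sketch,
# NOT the architecture from the paper. A per-task embedding is mapped to
# a sigmoid gate and an additive shift that modulate shared features, so
# one model can serve ASR, MT, and ST without separate fine-tuning.
import torch
import torch.nn as nn


class TaskModulation(nn.Module):
    """Modulates shared hidden states with task-specific scale and shift."""

    def __init__(self, num_tasks: int, d_model: int):
        super().__init__()
        # One embedding per task; ids are an assumed convention (0=ASR, 1=MT, 2=ST).
        self.task_embed = nn.Embedding(num_tasks, d_model)
        self.to_scale = nn.Linear(d_model, d_model)
        self.to_shift = nn.Linear(d_model, d_model)

    def forward(self, shared: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # shared: (batch, seq_len, d_model); task_id: (batch,)
        t = self.task_embed(task_id)                           # (batch, d_model)
        scale = torch.sigmoid(self.to_scale(t)).unsqueeze(1)   # gate in (0, 1)
        shift = self.to_shift(t).unsqueeze(1)                  # (batch, 1, d_model)
        # Task-specific features ride on top of the shared representation.
        return shared * scale + shift


# Usage: modulate a batch of shared encoder states for the ST task (id 2).
layer = TaskModulation(num_tasks=3, d_model=512)
states = torch.randn(4, 100, 512)
out = layer(states, torch.full((4,), 2, dtype=torch.long))
print(out.shape)  # torch.Size([4, 100, 512])
```

Because the gate and shift are the only task-specific parameters here, all tasks share the bulk of the network, which matches the paper's stated goal of a single model that performs all three tasks.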
Pages: 7723-7727
Page count: 5