Multi-Task Learning using Mismatched Transcription for Under-Resourced Speech Recognition

被引:5
|
作者
Van Hai Do [1 ,4 ]
Chen, Nancy E. [2 ]
Lim, Boon Pang [2 ]
Hasegawa-Johnson, Mark [1 ,3 ]
机构
[1] Viettel Grp, Hanoi, Vietnam
[2] ASTAR, Inst Infocomm Res, Singapore, Singapore
[3] Univ Illinois, Urbana, IL USA
[4] Adv Digital Sci Ctr, Singapore, Singapore
来源
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017年
关键词
mismatched transcription; probabilistic transcription; multi-task learning; low resourced languages; FEATURES; IMPROVE; ASR;
D O I
10.21437/Interspeech.2017-788
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
It is challenging to obtain large amounts of native (matched) labels for audio in under-resourced languages. This could be due to a lack of literate speakers of the language or a lack of universally acknowledged orthography. One solution is to increase the amount of labeled data by using mismatched transcription, which employs transcribers who do not speak the language (in place of native speakers), to transcribe what they hear as nonsense speech in their own language (e.g.. Mandarin). This paper presents a multi-task learning framework where the DNN acoustic model is simultaneously trained using both a limited amount of native (matched) transcription and a larger set of mismatched transcription. We find that by using a multi-task learning framework, we achieve improvements over monolingual baselines and previously proposed mismatched transcription adaptation techniques. In addition, we show that using alignments provided by a GMM adapted by mismatched transcription further improves acoustic modeling performance. Our experiments on Georgian data from the IARPA Babel program show the effectiveness of the proposed method.
引用
收藏
页码:734 / 738
页数:5
相关论文
共 50 条
  • [21] Coarse-to-Fine Speech Emotion Recognition Based on Multi-Task Learning
    Zhao Huijuan
    Ye Ning
    Wang Ruchuan
    Journal of Signal Processing Systems, 2021, 93 : 299 - 308
  • [22] Coarse-to-Fine Speech Emotion Recognition Based on Multi-Task Learning
    Zhao, Huijuan
    Ye, Ning
    Wang, Ruchuan
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2021, 93 (2-3): : 299 - 308
  • [23] TO REVERSE THE GRADIENT OR NOT: AN EMPIRICAL COMPARISON OF ADVERSARIAL AND MULTI-TASK LEARNING IN SPEECH RECOGNITION
    Adi, Yossi
    Zeghidour, Neil
    Collobert, Ronan
    Usunier, Nicolas
    Liptchinsky, Vitaliy
    Synnaeve, Gabriel
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3742 - 3746
  • [24] Recognition of Latin American Spanish using Multi-task Learning
    Mendes, Carlos
    Abad, Alberto
    Neto, Joao Paulo
    Trancoso, Isabel
    INTERSPEECH 2019, 2019, : 2135 - 2139
  • [25] BaDumTss: Multi-task Learning for Beatbox Transcription
    Mehta, Priya
    Maheshwari, Meet
    Joshi, Brihi
    Chakraborty, Tanmoy
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT III, 2022, 13282 : 173 - 186
  • [26] Speech based suicide risk recognition for crisis intervention hotlines using explainable multi-task learning
    Ding, Zhong
    Zhou, Yang
    Dai, An-Jie
    Qian, Chen
    Zhong, Bao-Liang
    Liu, Chen-Ling
    Liu, Zhen-Tao
    JOURNAL OF AFFECTIVE DISORDERS, 2025, 370 : 392 - 400
  • [27] Feature-Enhanced Multi-Task Learning for Speech Emotion Recognition Using Decision Trees and LSTM
    Wang, Chun
    Shen, Xizhong
    ELECTRONICS, 2024, 13 (14)
  • [28] A Primary task driven adaptive loss function for multi-task speech emotion recognition
    Liu, Lu-Yao
    Liu, Wen-Zhe
    Feng, Lin
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
  • [29] TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS
    Indurthi, Sathish
    Zaidi, Mohd Abbas
    Lakumarapu, Nikhil Kumar
    Lee, Beomseok
    Han, Hyojung
    Ahn, Seokchan
    Kim, Sangha
    Kim, Chanwoo
    Hwang, Inchul
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7723 - 7727
  • [30] Multimodal Sentiment Recognition With Multi-Task Learning
    Zhang, Sun
    Yin, Chunyong
    Yin, Zhichao
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (01): : 200 - 209