END-TO-END SPOKEN LANGUAGE UNDERSTANDING WITHOUT MATCHED LANGUAGE SPEECH MODEL PRETRAINING DATA

Cited: 0
Authors
Price, Ryan [1 ]
Affiliation
[1] Interactions LLC, Franklin, MA 02038 USA
Keywords
spoken language understanding; end-to-end; data augmentation; multilingual; pretraining; RECOGNITION;
DOI
10.1109/icassp40776.2020.9054573
CLC number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In contrast to conventional approaches to spoken language understanding (SLU) that consist of cascading a speech recognizer with a natural language understanding component, end-to-end (E2E) approaches for SLU infer semantics directly from the speech signal without processing it through separate subsystems. Pretraining part of the E2E models for speech recognition before finetuning the entire model for the target SLU task has proven to be an effective method to address the increased data requirements of E2E SLU models. However, transcribed corpora in the target language and domain may not always be available for pretraining an E2E SLU model. This paper proposes two strategies to improve the performance of E2E SLU models in scenarios where transcribed data for pretraining in the target language is unavailable: multilingual pretraining with mismatched languages and data augmentation using SpecAugment [1]. We demonstrate the effectiveness of these two methods for E2E SLU on two datasets, including one recently released publicly available dataset where we surpass the best previously published result despite not using any matched language data for pretraining.
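The data-augmentation strategy named in the abstract, SpecAugment, masks random frequency bands and time spans of the input spectrogram so the model cannot rely on any single region of the signal. A minimal NumPy sketch of that masking step is below; the mask counts and widths are illustrative defaults, not the settings used in the paper.

```python
import numpy as np

def spec_augment(spec, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """SpecAugment-style masking on a log-mel spectrogram.

    spec: array of shape (num_mel_bins, num_frames).
    Returns a copy with random frequency bands and time spans zeroed.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    n_mels, n_frames = out.shape
    # Frequency masking: zero a random band of mel channels.
    for _ in range(num_freq_masks):
        f = int(rng.integers(0, max_freq_width + 1))
        f0 = int(rng.integers(0, max(1, n_mels - f)))
        out[f0:f0 + f, :] = 0.0
    # Time masking: zero a random span of frames.
    for _ in range(num_time_masks):
        t = int(rng.integers(0, max_time_width + 1))
        t0 = int(rng.integers(0, max(1, n_frames - t)))
        out[:, t0:t0 + t] = 0.0
    return out

# Example: augment a dummy 80-bin, 300-frame spectrogram.
dummy = np.ones((80, 300), dtype=np.float32)
augmented = spec_augment(dummy, rng=np.random.default_rng(0))
```

Because masking is applied on the fly during training, each epoch sees a different corruption of the same utterance, which is what makes it useful when transcribed pretraining data is scarce.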
Pages: 7979 - 7983
Page count: 5
Related papers
50 records in total
  • [21] EFFICIENT USE OF END-TO-END DATA IN SPOKEN LANGUAGE PROCESSING
    Lu, Yiting
    Wang, Yu
    Gales, Mark J. F.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7518 - 7522
  • [22] IMPROVING END-TO-END MODELS FOR SET PREDICTION IN SPOKEN LANGUAGE UNDERSTANDING
    Kuo, Hong-Kwang J.
    Tuske, Zoltan
    Thomas, Samuel
    Kingsbury, Brian
    Saon, George
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7162 - 7166
  • [23] End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios
    Bhosale, Swapnil
    Sheikh, Imran
    Dumpala, Sri Harsha
    Kopparapu, Sunil Kumar
    INTERSPEECH 2019, 2019, : 1188 - 1192
  • [24] Integrating Dialog History into End-to-End Spoken Language Understanding Systems
    Ganhotra, Jatin
    Thomas, Samuel
    Kuo, Hong-Kwang J.
    Joshi, Sachindra
    Saon, George
    Tuske, Zoltan
    Kingsbury, Brian
    INTERSPEECH 2021, 2021, : 1254 - 1258
  • [25] END-TO-END ARCHITECTURES FOR ASR-FREE SPOKEN LANGUAGE UNDERSTANDING
    Palogiannidi, Elisavet
    Gkinis, Ioannis
    Mastrapas, George
    Mizera, Petr
    Stafylakis, Themos
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7974 - 7978
  • [26] FROM AUDIO TO SEMANTICS: APPROACHES TO END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Haghani, Parisa
    Narayanan, Arun
    Bacchiani, Michiel
    Chuang, Galen
    Gaur, Neeraj
    Moreno, Pedro
    Prabhavalkar, Rohit
    Qu, Zhongdi
    Waters, Austin
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 720 - 726
  • [27] Toward Low-Cost End-to-End Spoken Language Understanding
    Dinarelli, Marco
    Naguib, Marco
    Portet, Francois
    INTERSPEECH 2022, 2022, : 2728 - 2732
  • [28] Low resource end-to-end spoken language understanding with capsule networks
    Poncelet, Jakob
    Renkens, Vincent
    Van hamme, Hugo
    COMPUTER SPEECH AND LANGUAGE, 2021, 66
  • [29] TOP-DOWN ATTENTION IN END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Chen, Yixin
    Lu, Weiyi
    Mottini, Alejandro
    Li, Li Erran
    Droppo, Jasha
    Du, Zheng
    Zeng, Belinda
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6199 - 6203
  • [30] NON-AUTOREGRESSIVE END-TO-END APPROACHES FOR JOINT AUTOMATIC SPEECH RECOGNITION AND SPOKEN LANGUAGE UNDERSTANDING
    Li, Mohan
    Doddipatla, Rama
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 390 - 397