Semantic Complexity in End-to-End Spoken Language Understanding

Cited by: 7
Authors
McKenna, Joseph P. [1]
Choudhary, Samridhi [1]
Saxon, Michael [1]
Strimel, Grant P. [1]
Mouchtaris, Athanasios [1]
Affiliation
[1] Amazon, Alexa Machine Learning, Seattle, WA 98109 USA
Source
Keywords
spoken language understanding; semantic complexity; speech-to-interpretation; NETWORKS;
DOI
10.21437/Interspeech.2020-2929
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
End-to-end spoken language understanding (SLU) models are a class of model architectures that predict semantics directly from speech. Because of their input and output types, we refer to them as speech-to-interpretation (STI) models. Previous works have successfully applied STI models to targeted use cases, such as recognizing home automation commands; however, no study has yet addressed how these models generalize to broader use cases. In this work, we analyze the relationship between the performance of STI models and the difficulty of the use case to which they are applied. We introduce empirical measures of dataset semantic complexity to quantify the difficulty of the SLU tasks. We show that near-perfect performance metrics for STI models reported in the literature were obtained with datasets that have low semantic complexity values. We perform experiments where we vary the semantic complexity of a large, proprietary dataset and show that STI model performance correlates with our semantic complexity measures, such that performance increases as complexity values decrease. Our results show that it is important to contextualize an STI model's performance with the complexity values of its training dataset to reveal the scope of its applicability.
Pages: 4273-4277
Number of pages: 5
Related papers
50 in total
  • [31] TWO-STAGE TEXTUAL KNOWLEDGE DISTILLATION FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Kim, Seongbin
    Kim, Gyuwan
    Shin, Seongjin
    Lee, Sangmin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7463 - 7467
  • [32] Efficient Adaptation of Spoken Language Understanding based on End-to-End Automatic Speech Recognition
    Kim, Eesung
    Jajodia, Aditya
    Tseng, Cindy
    Neelagiri, Divya
    Ki, Taeyeon
    Apsingekar, Vijendra Raj
    INTERSPEECH 2023, 2023, : 3959 - 3963
  • [33] Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding
    Arora, Siddhant
    Ostapenko, Alissa
    Viswanathan, Vijay
    Dalmia, Siddharth
    Metze, Florian
    Watanabe, Shinji
    Black, Alan W.
    INTERSPEECH 2021, 2021, : 1264 - 1268
  • [34] Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
    Cha, Sujeong
    Hou, Wangrui
    Jung, Hyun
    Phung, My
    Picheny, Michael
    Kuo, Hong-Kwang J.
    Thomas, Samuel
    Morais, Edmilson
    INTERSPEECH 2021, 2021, : 4723 - 4727
  • [35] End-to-End Cross-Lingual Spoken Language Understanding Model with Multilingual Pretraining
    Zhang, Xianwei
    He, Liang
    INTERSPEECH 2021, 2021, : 4728 - 4732
  • [36] USE OF KERNEL DEEP CONVEX NETWORKS AND END-TO-END LEARNING FOR SPOKEN LANGUAGE UNDERSTANDING
    Deng, Li
    Tur, Gokhan
    He, Xiaodong
    Hakkani-Tur, Dilek
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 210 - 215
  • [37] Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech
    Tomashenko, Natalia
    Caubriere, Antoine
    Esteve, Yannick
    INTERSPEECH 2019, 2019, : 824 - 828
  • [38] Adapting Transformer to End-to-end Spoken Language Translation
    Di Gangi, Mattia A.
    Negri, Matteo
    Turchi, Marco
    INTERSPEECH 2019, 2019, : 1133 - 1137
  • [39] DIALOGUE HISTORY INTEGRATION INTO END-TO-END SIGNAL-TO-CONCEPT SPOKEN LANGUAGE UNDERSTANDING SYSTEMS
    Tomashenko, Natalia
    Raymond, Christian
    Caubriere, Antoine
    De Mori, Renato
    Esteve, Yannick
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8509 - 8513
  • [40] LARGE-SCALE UNSUPERVISED PRE-TRAINING FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Wang, Pengwei
    Wei, Liangchen
    Cao, Yong
    Xie, Jinghui
    Nie, Zaiqing
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7999 - 8003