ROLE PLAY DIALOGUE TOPIC MODEL FOR LANGUAGE MODEL ADAPTATION IN MULTI-PARTY CONVERSATION SPEECH RECOGNITION

Cited by: 0
Authors
Masumura, Ryo [1]
Oba, Takanobu [1]
Masataki, Hirokazu [1]
Yoshioka, Osamu [1]
Takahashi, Satoshi [1]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
Unsupervised language model adaptation; multi-party conversation speech recognition; topic model
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces an unsupervised language model adaptation technique for multi-party conversation speech recognition. Topic models provide one of the most accurate frameworks for unsupervised language model adaptation because they can inject long-range topic information into language models. However, conventional topic models are not suitable for multi-party conversation because they assume that each speech set has its own distinct topic. In a multi-party conversation, all speakers share the same conversation topic, and each speaker's utterances depend on both the topic and the speaker's role. Accordingly, this paper proposes a new concept, the "role play dialogue topic model", to exploit these multi-party conversation attributes. The proposed topic model shares a single topic distribution among all speakers while accounting for both topic and speaker role. Adaptation based on the proposed topic model yields a new framework that takes multiple recognition hypotheses, one set per speaker, and simultaneously adapts a language model for each speaker role. Speech recognition experiments on a call center dialogue data set show the effectiveness of the proposed method.
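The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' formulation. It assumes a single topic mixture `theta` shared by all speakers, role-dependent word distributions `PHI[role][topic]`, and an EM update that pools the recognition hypotheses of every speaker. The vocabulary, probabilities, role names, and function names are all hypothetical.

```python
# Toy sketch (assumption, not the paper's method): one topic mixture shared
# by all speakers, with word distributions that depend on the speaker role.

VOCAB = ["refund", "order", "sorry", "help"]  # hypothetical vocabulary

# Hypothetical role-dependent word distributions: PHI[role][topic][word_index].
PHI = {
    "operator": [[0.1, 0.1, 0.4, 0.4], [0.2, 0.2, 0.3, 0.3]],
    "customer": [[0.6, 0.2, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]],
}


def estimate_shared_topics(hypotheses, phi, n_topics, iters=20):
    """EM estimate of a single topic mixture from all speakers' hypotheses."""
    theta = [1.0 / n_topics] * n_topics
    for _ in range(iters):
        counts = [0.0] * n_topics
        for role, words in hypotheses.items():
            for word in words:
                idx = VOCAB.index(word)
                # E-step: topic posterior for this word given the speaker role.
                post = [theta[k] * phi[role][k][idx] for k in range(n_topics)]
                norm = sum(post)
                for k in range(n_topics):
                    counts[k] += post[k] / norm
        # M-step: renormalize expected topic counts pooled over all speakers.
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


def role_unigram(theta, phi, role):
    """Role-specific adapted unigram: sum over topics of theta_k * phi[role][k][w]."""
    return [
        sum(theta[k] * phi[role][k][i] for k in range(len(theta)))
        for i in range(len(VOCAB))
    ]


# First-pass recognition hypotheses, one word list per speaker role (made up).
hyps = {
    "operator": ["sorry", "help", "refund"],
    "customer": ["refund", "refund", "order"],
}
theta = estimate_shared_topics(hyps, PHI, n_topics=2)
adapted = {role: role_unigram(theta, PHI, role) for role in PHI}
```

In this sketch the same `theta` feeds both roles, so evidence from the customer's words also shifts the operator's adapted unigram. In a real system such a unigram would typically be combined with a baseline n-gram model, a step the abstract does not detail.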
Pages: 5
Related papers
50 in total
  • [41] Language model and speaking rate adaptation for spontaneous presentation speech recognition
    Nanjo, H
    Kawahara, T
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (04): 391-400
  • [42] Efficient Language Model Adaptation for Automatic Speech Recognition of Spoken Translations
    Pelemans, Joris
    Vanallemeersch, Tom
    Demuynck, Kris
    Van Hamme, Hugo
    Wambacq, Patrick
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 2262-2266
  • [43] A Discourse-Aware Graph Neural Network for Emotion Recognition in Multi-Party Conversation
    Sun, Yang
    Yu, Nan
    Fu, Guohong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 2949-2958
  • [44] Private-preserving language model inference based on secure multi-party computation
    Song, Chen
    Huang, Ruwei
    Hu, Sai
    NEUROCOMPUTING, 2024, 592
  • [45] Argumentative Ordering of Utterances for Language Generation in Multi-party Human-Computer Dialogue
    Popescu, Vladimir
    Caelen, Jean
    ARGUMENTATION, 2009, 23 (02): 205-237
  • [46] Paragraph Vector Based Topic Model for Language Model Adaptation
    Jin, Wengong
    He, Tianxing
    Qian, Yanmin
    Yu, Kai
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 3516-3520
  • [47] Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment
    Deena, Salil
    Hasan, Madina
    Doulaty, Mortaza
    Saz, Oscar
    Hain, Thomas
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (03): 572-582
  • [48] Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech
    Sklyar, Ilya
    Piunova, Anna
    Osendorfer, Christian
    INTERSPEECH 2022, 2022: 4451-4455
  • [49] A SINGLE-PORT NON-PARAMETRIC MODEL OF TURN-TAKING IN MULTI-PARTY CONVERSATION
    Laskowski, Kornel
    Edlund, Jens
    Heldner, Mattias
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011: 5600-5603
  • [50] Acoustic Model Adaptation for Speech Recognition
    Shinoda, Koichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): 2348-2362