ROLE PLAY DIALOGUE TOPIC MODEL FOR LANGUAGE MODEL ADAPTATION IN MULTI-PARTY CONVERSATION SPEECH RECOGNITION

Cited by: 0
Authors
Masumura, Ryo [1]
Oba, Takanobu [1]
Masataki, Hirokazu [1]
Yoshioka, Osamu [1]
Takahashi, Satoshi [1]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
Unsupervised language model adaptation; multi-party conversation speech recognition; topic model
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces an unsupervised language model adaptation technique for multi-party conversation speech recognition. Topic models provide one of the most accurate frameworks for unsupervised language model adaptation because they can inject long-range topic information into language models. However, conventional topic models are not suitable for multi-party conversation because they assume that each speech set has its own distinct topic. In a multi-party conversation, the speakers share the same conversation topic, and each speaker's utterances depend on both the topic and the speaker's role. Accordingly, this paper proposes a new concept, the "role play dialogue topic model", to exploit these attributes of multi-party conversation. The proposed topic model shares the topic distribution among the speakers and takes both topic and speaker role into account. Adaptation based on the proposed topic model realizes a new framework that sets multiple recognition hypotheses for each speaker and simultaneously adapts a language model for each speaker role. Speech recognition experiments on a call center dialogue data set show the effectiveness of the proposed method.
Pages: 5