Prosodic and temporal features for language modeling for dialog

Cited by: 10
Authors
Ward, Nigel G. [1]
Vega, Alejandro [1]
Baumann, Timo [2]
Affiliations
[1] Univ Texas El Paso, El Paso, TX 79968 USA
[2] Univ Potsdam, Dept Linguist, D-14476 Potsdam, Germany
Funding
National Science Foundation (USA)
Keywords
Dialog dynamics; Dialog state; Prosody; Interlocutor behavior; Word probabilities; Prediction; Perplexity; Speech recognition; Switchboard corpus; Verbmobil corpus
DOI
10.1016/j.specom.2011.07.009
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
If we can model the cognitive and communicative processes underlying speech, we should be able to better predict what a speaker will do. With this idea as inspiration, we examine a number of prosodic and timing features as potential sources of information on what words the speaker is likely to say next. In spontaneous dialog we find that word probabilities do vary with such features. Using perplexity as the metric, the most informative of these included recent speaking rate, volume, and pitch, and time until end of utterance. Using simple combinations of such features to augment trigram language models gave up to an 8.4% perplexity benefit on the Switchboard corpus, and up to a 1.0% relative reduction in word error rate (0.3% absolute) on the Verbmobil II corpus. (C) 2011 Elsevier B.V. All rights reserved.
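Note on the metrics: the abstract reports gains as a perplexity benefit and as relative versus absolute word error rate (WER) reductions. The sketch below is only an illustration of how such figures are conventionally computed, not the authors' evaluation code. The function names and all probability values are hypothetical; only the 1.0% relative and 0.3% absolute WER figures come from the abstract.

```python
import math

def perplexity(word_probs):
    """Perplexity = exp of the average negative log probability per word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    avg_neg_log = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log)

def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

# Hypothetical per-word probabilities (NOT data from the paper):
# a baseline model, and an augmented model that assigns each word
# 10% more probability mass.
baseline_probs = [0.10, 0.05, 0.20, 0.10]
augmented_probs = [0.11, 0.055, 0.22, 0.11]

ppl_gain = relative_reduction(perplexity(baseline_probs),
                              perplexity(augmented_probs))
print(f"relative perplexity reduction: {ppl_gain:.1%}")

# Relative vs. absolute WER: a 0.3-point absolute drop that equals a
# 1.0% relative drop implies a baseline WER near 30% (subject to the
# rounding of the reported figures).
implied_baseline_wer = 0.3 / 0.01
print(f"implied baseline WER: about {implied_baseline_wer:.0f}%")
```

Because perplexity is a geometric mean of inverse probabilities, scaling every word probability by a constant factor scales perplexity by the inverse of that factor, which is why the toy example yields the same relative reduction regardless of the individual values.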
Pages: 161-174
Page count: 14
Related papers
50 in total
  • [21] The prosodic framework for language learning
    Fee, EJ
    TOPICS IN LANGUAGE DISORDERS, 1997, 17 (04) : 53 - 62
  • [22] Prosodic features of polite speech
    Brown, Lucien
    Oh, Grace Eunhae
    Idemaru, Kaori
    PRAGMATICS, 2024,
  • [23] Prosodic features of stances in conversation
    Freeman, Valerie
    LABORATORY PHONOLOGY, 2019, 10 (01):
  • [24] Automatically predicting dialogue structure using prosodic features
    Hastie, HW
    Poesio, M
    Isard, S
    SPEECH COMMUNICATION, 2002, 36 (1-2) : 63 - 79
  • [25] Prosodic Features for Speaker Verification
    Mary, Leena
    Yegnanarayana, B.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 917 - 920
  • [26] Prosodic Features' Criterion for Hebrew
    Fishman, Ben
    Lapidot, Itshak
    Opher, Irit
    TEXT, SPEECH, AND DIALOGUE (TSD 2018), 2018, 11107 : 482 - 491
  • [27] An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features
    Weng, Shi-Yan
    Lo, Tien-Hong
    Chen, Berlin
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 316 - 320
  • [28] YembaTones: A syllable-tone annotated dataset for speech recognition and prosodic analysis of the Yemba language
    Jeuguim, Marc Sturm Kenfack
    Yonta, Paulin Melatagia
    Sandembouo, Etienne
    DATA IN BRIEF, 2024, 52
  • [29] Latent Prosodic Modeling (LPM) for Speech with Applications in Recognizing Spontaneous Mandarin Speech with Disfluencies
    Lin, Che-Kuang
    Lee, Lin-Shan
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2390 - 2393
  • [30] A Joint Prosodic Origin of Language and Music
    Brown, Steven
    FRONTIERS IN PSYCHOLOGY, 2017, 8