On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling

Cited by: 0
Authors
Annika Hämäläinen
Lou Boves
Johan de Veth
Louis ten Bosch
Affiliations
[1] Radboud University Nijmegen, Centre for Language and Speech Technology (CLST), Faculty of Arts
Source
EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2007
Keywords
Acoustics; Speech Recognition; Substantial Effect; Recognition Performance; Considerable Improvement
DOI
Not available
Abstract
Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which longer-length acoustic models result in considerable improvements in recognition performance, we carry out recognition experiments on both TIMIT and the Spoken Dutch Corpus and analyse the differences between the two sets of results. We establish that the details of the procedure used for initialising the longer-length models have a substantial effect on the speech recognition results. However, even when initialised appropriately, longer-length acoustic models that borrow their topology from a sequence of triphones cannot capture the pronunciation variation phenomena that hinder recognition performance the most.