On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling

Cited by: 0

Authors:

Annika Hämäläinen

Lou Boves

Johan de Veth

Louis ten Bosch

Affiliation:

[1] Radboud University Nijmegen, Centre for Language and Speech Technology (CLST), Faculty of Arts

Source:

EURASIP Journal on Audio, Speech, and Music Processing, vol. 2007

Keywords:

Acoustics; Speech Recognition; Substantial Effect; Recognition Performance; Considerable Improvement

DOI:

Not available

Abstract:

Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which longer-length acoustic models result in considerable improvements in recognition performance, we carry out recognition experiments on both TIMIT and the Spoken Dutch Corpus and analyse the differences between the two sets of results. We establish that the details of the procedure used for initialising the longer-length models have a substantial effect on the speech recognition results. When initialised appropriately, longer-length acoustic models that borrow their topology from a sequence of triphones cannot capture the pronunciation variation phenomena that hinder recognition performance the most.
