On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling

Cited by: 0
Authors
Annika Hämäläinen
Lou Boves
Johan de Veth
Louis ten Bosch
Affiliation
[1] Radboud University Nijmegen, Centre for Language and Speech Technology (CLST), Faculty of Arts
Source
EURASIP Journal on Audio, Speech, and Music Processing, Volume 2007
Keywords
Acoustics; Speech Recognition; Substantial Effect; Recognition Performance; Considerable Improvement
DOI
Not available
Abstract
Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which longer-length acoustic models result in considerable improvements in recognition performance, we carry out recognition experiments on both TIMIT and the Spoken Dutch Corpus and analyse the differences between the two sets of results. We establish that the details of the procedure used for initialising the longer-length models have a substantial effect on the speech recognition results. When initialised appropriately, longer-length acoustic models that borrow their topology from a sequence of triphones cannot capture the pronunciation variation phenomena that hinder recognition performance the most.