Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition

Cited by: 34
Authors
Kim, Myungjong [1]
Kim, Younggwan [2]
Yoo, Joohong [2]
Wang, Jun [1]
Kim, Hoirin [2]
Affiliations
[1] Univ Texas Dallas, Dept Bioengn, Richardson, TX 75080 USA
[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 305701, South Korea
Funding
US National Institutes of Health; National Research Foundation, Singapore
Keywords
Dysarthria; speech recognition; speaker adaptation; KL-HMM; regularization; Kullback-Leibler divergence; acoustic model
DOI
10.1109/TNSRE.2017.2681691
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, which results in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization, which can enhance discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. In an evaluation on a database of several hundred words from 30 speakers (12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers), the proposed approach significantly outperformed a conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.
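The KL-HMM idea described in the abstract can be illustrated with a minimal sketch. It assumes that each state holds a categorical distribution over phonemes, that a frame is scored by the KL divergence between that distribution and the DNN phoneme posterior vector, and that the L2 term pulls the adapted state distribution toward a reference (for example, speaker-independent) distribution. The function names, the direction of the divergence, and the form of the L2 term are assumptions made for illustration, not the paper's formulation; the confusion-reducing regularizer is omitted because the abstract does not define it.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two categorical distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def local_score(state_dist, posterior):
    """Assumed KL-HMM local score: divergence between a state's categorical
    distribution and one frame's phoneme posterior vector (lower is better)."""
    return kl_divergence(state_dist, posterior)

def l2_penalty(adapted, reference):
    """Squared Euclidean distance between adapted and reference parameters
    (hypothetical form of the L2 regularizer)."""
    return sum((a - r) ** 2 for a, r in zip(adapted, reference))

def adaptation_cost(adapted, reference, posteriors, l2_weight):
    """Average local score over the frames aligned to one state, plus the
    weighted L2 penalty. The confusion-reducing term is not modeled here."""
    data_term = sum(local_score(adapted, z) for z in posteriors) / len(posteriors)
    return data_term + l2_weight * l2_penalty(adapted, reference)

if __name__ == "__main__":
    reference = [0.7, 0.2, 0.1]          # hypothetical speaker-independent state
    adapted = [0.5, 0.4, 0.1]            # hypothetical speaker-adapted state
    frames = [[0.4, 0.5, 0.1], [0.5, 0.4, 0.1]]  # made-up posterior vectors
    print(adaptation_cost(adapted, reference, frames, l2_weight=0.5))
```

A larger `l2_weight` keeps the adapted distribution closer to the reference, which matters when a speaker provides only a small amount of adaptation data.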
Pages: 1581-1591
Page count: 11
Related papers
50 in total
  • [21] Speech/speaker recognition using a HMM/GMM hybrid model
    Rodriguez, E
    Ruiz, B
    Garcia-Crespo, A
    Garcia, F
    AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, 1997, 1206 : 227 - 234
  • [22] Speaker Dependent Continuous Kannada Speech Recognition Using HMM
    Hemakumar, G.
    Punitha, P.
    2014 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING APPLICATIONS (ICICA 2014), 2014, : 402 - 405
  • [23] AANN-HMM Models for Speaker Verification and Speech Recognition
    Joshi, Sachin
    Prahallad, Kishore
    Yegnanarayana, B.
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 2681 - 2688
  • [24] Speaker clustering and transformation for speaker adaptation in speech recognition systems
    Padmanabhan, M
    Bahl, LR
    Nahamoo, D
    Picheny, MA
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (01): : 71 - 77
  • [25] Rapid speaker adaptation for continuous speech recognition
    Lu, Ping
    Wu, Ji
    Wang, Zuoying
    Lu, Dajin
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2002, 42 (07): : 977 - 980
  • [26] Exploring AI-based Speaker Dependent Methods in Dysarthric Speech Recognition
    Mulfari, Davide
    Celesti, Antonio
    Villari, Massimo
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022, : 958 - 964
  • [27] SPEAKER ADAPTATION IN A LIMITED SPEECH RECOGNITION SYSTEM
    MAKHOUL, J
    IEEE TRANSACTIONS ON COMPUTERS, 1971, C 20 (09) : 1057 - &
  • [28] DOMAIN AND SPEAKER ADAPTATION FOR CORTANA SPEECH RECOGNITION
    Zhao, Yong
    Li, Jinyu
    Zhang, Shixiong
    Chen, Liping
    Gong, Yifan
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5984 - 5988
  • [29] Quick fMLLR for speaker adaptation in speech recognition
    Varadarajan, Balakrishnan
    Povey, Daniel
    Chu, Stephen M.
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4297 - +
  • [30] Speaker Adaptation on Myanmar Spontaneous Speech Recognition
    Naing, Hay Mar Soe
    Pa, Win Pa
    COMPUTATIONAL LINGUISTICS, PACLING 2017, 2018, 781 : 303 - 313