DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Cited by: 0
Authors
Hermann, Enno [1 ,2 ]
Magimai-Doss, Mathew [1 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Funding
EU Horizon 2020;
Keywords
Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR;
DOI
10.1109/icassp40776.2020.9053549
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
Pages: 6109-6113
Page count: 5