DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Authors
Hermann, Enno [1, 2]
Magimai-Doss, Mathew [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020
Funding
EU Horizon 2020
Keywords
Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR
DOI
10.1109/icassp40776.2020.9053549
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
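The two training-regime ingredients named in the abstract, speed perturbation and frame subsampling, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `speed_perturb` and `subsample_frames` are hypothetical, the linear-interpolation resampler stands in for a proper resampling filter, and the perturbation factors (0.9, 1.1) and the subsampling factor (3) are commonly used values assumed here, not taken from the abstract.

```python
def speed_perturb(signal, factor):
    """Resample `signal` so it plays `factor` times faster.

    Assumption: plain linear interpolation, used only for illustration;
    real toolkits apply a proper resampling filter.
    """
    n_in = len(signal)
    n_out = int(round(n_in / factor))
    out = []
    for i in range(n_out):
        # Position in the original signal that output sample i is read from.
        pos = i * (n_in - 1) / max(n_out - 1, 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


def subsample_frames(frames, factor=3):
    """Keep every `factor`-th frame along the time axis."""
    return frames[::factor]


# Toy input (hypothetical): a 1000-sample ramp and 100 feature frames.
signal = [float(i) for i in range(1000)]
fast = speed_perturb(signal, 1.1)   # shorter: simulates faster speech
slow = speed_perturb(signal, 0.9)   # longer: simulates slower speech
frames = [[0.0] * 40 for _ in range(100)]
reduced = subsample_frames(frames, 3)
```

In this sketch, speed perturbation exposes the model to several speaking rates of the same utterance, while subsampling reduces the number of frames evaluated per second of speech; both change how utterance duration maps to model time steps, which is the aspect the abstract links to low speaking rates.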
Pages: 6109-6113
Page count: 5