DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Cited by: 0

Authors:

Hermann, Enno [1,2]

Magimai-Doss, Mathew [1]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

EU Horizon 2020;

Keywords:

Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR;

DOI:

10.1109/icassp40776.2020.9053549

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206; 082403;

Abstract:

Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.


Pages: 6109 - 6113

Number of pages: 5

50 records in total

[31] Domain Adversarial Neural Networks for Dysarthric Speech Recognition
Woszczyk, Dominika
Petridis, Stavros
Millard, David
INTERSPEECH 2020, 2020 : 3875 - 3879
[32] ON THE USE OF HIDDEN MARKOV MODELING FOR RECOGNITION OF DYSARTHRIC SPEECH
DELLER, JR
HSU, D
FERRIER, LJ
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 1991, 35 (02) : 125 - 139
[33] Recent Progress in the CUHK Dysarthric Speech Recognition System
Liu, Shansong
Geng, Mengzhe
Hu, Shoukang
Xie, Xurong
Cui, Mingyu
Yu, Jianwei
Liu, Xunying
Meng, Helen
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2267 - 2281
[34] Handling acoustic variation in dysarthric speech recognition systems through model combination
Hermann, Enno
Magimai-Doss, Mathew
INTERSPEECH 2021, 2021 : 4788 - 4792
[35] Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Baali, Massa
Almakky, Ibrahim
Shehata, Shady
Karray, Fakhri
INTERSPEECH 2023, 2023 : 1558 - 1562
[36] Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition
Takashima, Yuki
Takashima, Ryoichi
Takiguchi, Tetsuya
Ariki, Yasuo
IEEE ACCESS, 2019, 7 : 164320 - 164326
[37] Development of the CUHK Dysarthric Speech Recognition System for the UASpeech Corpus
Yu, Jianwei
Xie, Xurong
Liu, Shansong
Hu, Shoukang
Lam, Max W. Y.
Wu, Xixin
Wong, Ka Ho
Liu, Xunying
Meng, Helen
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018 : 2938 - 2942
[38] Familiarisation conditions and the mechanisms that underlie improved recognition of dysarthric speech
Borrie, Stephanie A.
McAuliffe, Megan J.
Liss, Julie M.
Kirk, Cecilia
O'Beirne, Gregory A.
Anderson, Tim
LANGUAGE AND COGNITIVE PROCESSES, 2012, 27 (7-8) : 1039 - 1055
[39] EXPERIMENTS IN DYSARTHRIC SPEECH RECOGNITION USING ARTIFICIAL NEURAL NETWORKS
JAYARAM, G
ABDELHAMIED, K
JOURNAL OF REHABILITATION RESEARCH AND DEVELOPMENT, 1995, 32 (02) : 162 - 169
[40] Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
Jin, Zengrui
Geng, Mengzhe
Deng, Jiajun
Wang, Tianzi
Hu, Shujie
Li, Guinan
Liu, Xunying
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 413 - 429
