Data Augmentation using Healthy Speech for Dysarthric Speech Recognition

Times cited: 0
Authors
Vachhani, Bhavik [1]
Bhat, Chitralekha [1]
Kopparapu, Sunil Kumar [1]
Affiliations
[1] TCS Res & Innovat, Mumbai, Maharashtra, India
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
Dysarthric speech recognition; Data augmentation; Dysarthria severity
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dysarthria is a speech disorder caused by trauma to the brain areas that control the motor aspects of speech; it gives rise to effortful, slow, slurred, or prosodically abnormal speech. Traditional automatic speech recognition (ASR) systems perform poorly on dysarthric speech, owing mostly to insufficient dysarthric speech data. Speaker-related challenges complicate the data collection process for dysarthric speech. In this paper, we explore data augmentation that applies temporal and speed modifications to healthy speech to simulate dysarthric speech. DNN-HMM based ASR and Random Forest based classification were used to evaluate the proposed method. Synthetically generated dysarthric speech is classified by severity level using a Random Forest classifier trained on actual dysarthric speech. An ASR system trained on healthy speech augmented with simulated dysarthric speech is evaluated on dysarthric speech recognition. All evaluations were carried out on the Universal Access dysarthric speech corpus. Absolute improvements of 4.24% and 2% were achieved using tempo-based and speed-based data augmentation, respectively, compared with ASR performance when training on healthy speech alone.
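The abstract distinguishes speed-based from tempo-based modification but does not say how either was implemented. The following is only an illustrative sketch of the general distinction, not the authors' method: `change_speed` resamples the waveform (duration and pitch change together), while `change_tempo` uses a naive overlap-add stretch (duration changes, samples are not resampled). Both function names, the `frame` and `hop` defaults, and the choice of plain overlap-add are assumptions made for this example; practical systems usually rely on more careful algorithms such as WSOLA or a phase vocoder.

```python
import math


def change_speed(samples, factor):
    """Resample by linear interpolation (assumed helper, not from the paper).

    factor < 1 slows the signal down: it gets longer and its pitch drops.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, round(len(samples) / factor))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = min(i * factor, last)      # fractional read position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def change_tempo(samples, factor, frame=400, hop=100):
    """Naive overlap-add time stretch (assumed helper, not from the paper).

    factor < 1 lengthens the signal by re-spacing windowed frames, so the
    waveform inside each frame is copied rather than resampled.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, round(len(samples) / factor))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / frame) for k in range(frame)]
    out = [0.0] * (n_out + frame)
    norm = [0.0] * (n_out + frame)
    max_start = max(0, len(samples) - frame)
    out_pos = 0
    while out_pos < n_out:
        # Read frames more slowly than they are written; clamp at the tail.
        in_pos = min(int(out_pos * factor), max_start)
        for k in range(frame):
            j = in_pos + k
            if j >= len(samples):
                break
            out[out_pos + k] += window[k] * samples[j]
            norm[out_pos + k] += window[k]
        out_pos += hop
    # Divide by the accumulated window weight to undo the overlap gain.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out[:n_out], norm[:n_out])]


if __name__ == "__main__":
    # A 200 Hz tone at 16 kHz, slowed to half rate by both methods.
    tone = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
    print(len(change_speed(tone, 0.5)), len(change_tempo(tone, 0.5)))
```

In both functions a factor below 1 produces slower, longer audio, which is the direction relevant to simulating slow speech; the specific factors used in the paper are not given in this record.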
Pages: 471-475
Number of pages: 5
Related papers
50 in total
  • [1] Analysis for Using Noise as a Source of Data Augmentation for Dysarthric Speech Recognition
    Nawroly, Sarkhell Sirwan
    Popescu, Decebal
    Celin, T. A. Mariya
    Jeeva, M. P. Actlin
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2025,
  • [2] Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
    Jin, Zengrui
    Geng, Mengzhe
    Deng, Jiajun
    Wang, Tianzi
    Hu, Shujie
    Li, Guinan
    Liu, Xunying
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 413 - 429
  • [3] Few-shot dysarthric speech recognition with text-to-speech data augmentation
    Hermann, Enno
    Magimai-Doss, Mathew
    INTERSPEECH 2023, 2023, : 156 - 160
  • [4] SNR-Selection-Based-Data Augmentation for Dysarthric Speech Recognition
    Nawroly, Sarkhell Sirwan
    Popescu, Decebal Gheorghe
    Antony, Mariya Celin Thekekara
    Philominal, Actlin Jeeva Muthu
    STUDIES IN INFORMATICS AND CONTROL, 2023, 32 (04): : 129 - 140
  • [5] Exploring Alternative Data Augmentation Methods in Dysarthric Automatic Speech Recognition
    Gracelli, Ricardo
    Almeida, Jurandy
    2024 IEEE 37TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, CBMS 2024, 2024, : 243 - 248
  • [6] SIMULATING DYSARTHRIC SPEECH FOR TRAINING DATA AUGMENTATION IN CLINICAL SPEECH APPLICATIONS
    Jiao, Yishan
    Tu, Ming
    Berisha, Visar
    Liss, Julie
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6009 - 6013
  • [7] Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
    Baali, Massa
    Almakky, Ibrahim
    Shehata, Shady
    Karray, Fakhri
    INTERSPEECH 2023, 2023, : 1558 - 1562
  • [8] SYNTHESIZING DYSARTHRIC SPEECH USING MULTI-SPEAKER TTS FOR DYSARTHRIC SPEECH RECOGNITION
    Soleymanpour, Mohammad
    Johnson, Michael T.
    Soleymanpour, Rahim
    Berry, Jeffrey
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7382 - 7386
  • [9] Using speech rhythm knowledge to improve dysarthric speech recognition
    Selouani, S. -A.
    Dahmani, H.
    Amami, R.
    Hamam, H.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (01) : 57 - 64
  • [10] Accurate synthesis of dysarthric Speech for ASR data augmentation
    Soleymanpour, Mohammad
    Johnson, Michael T.
    Soleymanpour, Rahim
    Berry, Jeffrey
    SPEECH COMMUNICATION, 2024, 164