Classification of Dysarthric Speech According to the Severity of Impairment: an Analysis of Acoustic Features

Cited by: 32

Authors:

Al-Qatab, Bassam Ali [1]

Mustafa, Mumtaz Begum [1]

Affiliations:

[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Software Engn, Kuala Lumpur 50603, Malaysia

Source:

IEEE ACCESS | 2021 / Vol. 9

Keywords:

Feature extraction; Speech recognition; Frequency modulation; Databases; Classification algorithms; Frequency measurement; Muscles; Acoustic features; automatic dysarthric speech recognition system; dysarthria; classification algorithms; feature selection methods; INTELLIGIBILITY; MODELS

DOI:

10.1109/ACCESS.2021.3053335

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The automatic speech recognition (ASR) system is increasingly being applied as assistive technology in the speech impaired community, for individuals with physical disabilities such as dysarthric speakers. However, the effectiveness of the ASR system in recognizing dysarthric speech can be disadvantaged by data sparsity, either in the coverage of the language, or the size of the existing speech database, not counting the severity of the speech impairment. This study examines the acoustic features and feature selection methods that can be used to improve the classification of dysarthric speech, based on the severity of the impairment. For the purpose of this study, we incorporated four acoustic features including prosody, spectral, cepstral, and voice quality and seven feature selection methods which encompassed Interaction Capping (ICAP), Conditional Information Feature Extraction (CIFE), Conditional Mutual Information Maximization (CMIM), Double Input Symmetrical Relevance (DISR), Joint Mutual Information (JMI), Conditional redundancy (Condred) and Relief. Further to that, we engaged six classification algorithms like Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Artificial Neural Network (ANN), Classification and Regression Tree (CART), Naive Bayes (NB), and Random Forest (RF) in our experiment. The classification accuracy of our experiments ranges from 40.41% to 95.80%.
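One of the feature selection criteria named in the abstract, Joint Mutual Information (JMI), can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes features have already been discretized, it uses the common greedy form of JMI (pick the candidate that maximizes the summed mutual information between the class label and each pair formed by the candidate and an already selected feature), and the feature names (`pitch`, `energy`, and so on) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from two equal-length sequences of discrete symbols."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def jmi_select(features, labels, k):
    """Greedy JMI selection.

    features: dict mapping a feature name to a list of discrete values.
    labels:   list of class labels (e.g. severity levels), same length.
    Returns the names of up to k selected features, in selection order.
    """
    remaining = dict(features)
    # Step 1: start with the single most informative feature.
    first = max(remaining, key=lambda name: mutual_information(remaining[name], labels))
    selected = [first]
    del remaining[first]

    # Step 2: repeatedly add the candidate with the largest summed
    # joint information I((candidate, selected_j); Y).
    while remaining and len(selected) < k:
        def score(name):
            return sum(
                mutual_information(list(zip(remaining[name], features[s])), labels)
                for s in selected
            )
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical toy data: 8 utterances, 4 severity classes.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "pitch":        [0, 0, 1, 1, 2, 2, 2, 2],  # informative
    "pitch_coarse": [0, 0, 1, 1, 1, 1, 1, 1],  # redundant given "pitch"
    "energy":       [0, 0, 0, 0, 0, 0, 1, 1],  # complements "pitch"
    "noise":        [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
top_two = jmi_select(features, labels, k=2)
```

In this toy example the redundant feature is passed over in favor of the complementary one, which is the behavior that motivates joint criteria such as JMI. In practice the selected subset would then be passed to a classifier such as the SVM or Random Forest mentioned in the abstract.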

Pages: 18183-18194

Page count: 12
