Intelligibility Classification of Pathological Speech Using Fusion of Multiple Subsystems

Cited by: 0

Authors:

Kim, Jangwon [1]

Kumar, Naveen [1]

Tsiartas, Andreas [1]

Li, Ming [1]

Narayanan, Shrikanth S. [1]

Affiliation:

[1] Univ So Calif, Signal Anal & Interpretat Lab, Los Angeles, CA 90089 USA

Source:

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012

Keywords:

pathological speech; intelligibility of speech; fusion of multiple subsystems;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. While automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we design multiple subsystems to address different aspects of pathological speech characteristics. These subsystems are then fused at the binary hard score level (intelligible or not intelligible) using Bayesian networks. Results show that subsystems, such as multiple language phoneme probability system, prosodic and intonational subsystem, and voice quality and pronunciation subsystem, have discriminating power for intelligibility (9.8%, 17.1%, 14.6% higher than by-chance respectively). Noisy-Majority based fusion shows 66.4% accuracy, but the performance improvement by fusion is not made. Also, voice clustering based joint classification is applied to minimize misclassification of the best subsystem, and it shows the best classification accuracy (79.9% on dev set, 76.8% on test set).
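The idea of fusing binary hard decisions (intelligible or not intelligible) from several subsystems can be illustrated with a minimal sketch. Note the assumptions: this uses a plain majority vote, not the Bayesian-network or Noisy-Majority fusion described in the abstract, whose details are not given in this record. The function names, the tie-breaking rule, and the example decisions are all hypothetical.

```python
from typing import Sequence


def majority_vote(decisions: Sequence[int]) -> int:
    """Fuse binary hard decisions: 1 = intelligible, 0 = not intelligible."""
    if not decisions:
        raise ValueError("need at least one subsystem decision")
    votes_for_intelligible = sum(1 for d in decisions if d == 1)
    # Assumption: ties (possible with an even number of subsystems) go to 0.
    return 1 if 2 * votes_for_intelligible > len(decisions) else 0


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of utterances whose fused label matches the reference label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up decisions from three hypothetical subsystems for four utterances.
subsystem_decisions = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
]
fused = [majority_vote(row) for row in subsystem_decisions]
```

A learned fusion model differs from this sketch in that it can weight subsystems by their reliability rather than counting every vote equally.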


Pages: 534-537

Page count: 4

15 entries in total (first 10 listed)

[1] Automatic intelligibility classification of sentence-level pathological speech
Kim, Jangwon
Kumar, Naveen
Tsiartas, Andreas
Li, Ming
Narayanan, Shrikanth S.
COMPUTER SPEECH AND LANGUAGE, 2015, 29 (01) : 132 - 144
[2] A MIXTURE OF EXPERTS APPROACH TOWARDS INTELLIGIBILITY CLASSIFICATION OF PATHOLOGICAL SPEECH
Gupta, Rahul
Audhkhasi, Kartik
Narayanan, Shrikanth
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 1986 - 1990
[3] A discriminative reliability-aware classification model with applications to intelligibility classification in pathological speech
Kumar, Naveen
Narayanan, Shrikanth S.
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 90 - 94
[4] COMBINING MULTIPLE KERNEL MODELS FOR AUTOMATIC INTELLIGIBILITY DETECTION OF PATHOLOGICAL SPEECH
Huang, Dong-Yan
Dong, Minghui
Li, Haizhou
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016: 6485 - 6489
[5] Automated Intelligibility Assessment of Pathological Speech Using Phonological Features
Middag, Catherine
Martens, Jean-Pierre
Van Nuffelen, Gwen
De Bodt, Marc
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009
[6] Automated Intelligibility Assessment of Pathological Speech Using Phonological Features
Catherine Middag
Jean-Pierre Martens
Gwen Van Nuffelen
Marc De Bodt
EURASIP Journal on Advances in Signal Processing, 2009
[7] INTELLIGIBILITY DETECTION OF PATHOLOGICAL SPEECH USING ASYMMETRIC SPARSE KERNEL PARTIAL LEAST SQUARES CLASSIFIER
Huang, Dong-Yan
Dong, Minghui
Li, Haizhou
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
[8] Combining phonological and acoustic ASR-free features for pathological speech intelligibility assessment
Middag, Catherine
Bocklet, Tobias
Martens, Jean-Pierre
Noeth, Elmar
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 3016 - +
[9] Automatic Assessment of Speech Intelligibility using Consonant Similarity for Head and Neck Cancer
Quintas, Sebastiao
Mauclair, Julie
Woisard, Virginie
Pinquier, Julien
INTERSPEECH 2022, 2022: 3608 - 3612
[10] Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction
Maraoui, Mohsen
Terbeh, Naim
Zrigui, Mounir
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (04) : 1071 - 1090
