Automatic Intelligibility Assessment of Dysarthric Speech Using Phonologically-Structured Sparse Linear Model

Cited by: 40

Authors:

Kim, Myung Jong [1]

Kim, Younggwan [1]

Kim, Hoirin [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Dept Elect Engn, Taejon 305701, South Korea

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2015 / Vol. 23 / Issue 04

Funding:

National Research Foundation of Singapore;

Keywords:

Dysarthria; pronunciation confusion network; speech intelligibility assessment; structured sparse model; weighted finite state transducer (WFST); RECOGNITION; DISORDERS; SPEAKERS; ERRORS;

DOI:

10.1109/TASLP.2015.2403619

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper presents a new method for automatically assessing the speech intelligibility of patients with dysarthria, which is a motor speech disorder impeding the physical production of speech. The proposed method consists of two main steps: feature representation and prediction. In the feature representation step, the speech utterance is converted into a phone sequence using an automatic speech recognition technique and is then aligned with a canonical phone sequence from a pronunciation dictionary using a weighted finite state transducer to capture the pronunciation mappings such as match, substitution, and deletion. The histograms of the pronunciation mappings on a pre-defined word set are used for features. Next, in the prediction step, a structured sparse linear model incorporated with phonological knowledge that simultaneously addresses phonologically structured sparse feature selection and intelligibility prediction is proposed. Evaluation of the proposed method on a database of 109 speakers consisting of 94 dysarthric and 15 control speakers yielded a root mean square error of 8.14 compared to subjectively rated scores in the range of 0 to 100. This is a promising performance in which the system can be successfully applied to help speech therapists in diagnosing the degree of speech disorder.
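The feature representation step described in the abstract can be illustrated with a minimal sketch. The paper performs the alignment with a weighted finite state transducer; the version below swaps in a plain unit-cost edit-distance alignment instead, so it should be read as an approximation of the idea and not as the authors' implementation. The function names `align_phones` and `mapping_histogram` and the phone symbols are hypothetical and chosen only for illustration.

```python
from collections import Counter


def align_phones(canonical, recognized):
    """Align a canonical phone sequence with a recognized one.

    Assumption: unit-cost edit distance stands in for the WFST-based
    alignment used in the paper. Returns a list of
    (mapping_type, canonical_phone, recognized_phone) tuples.
    """
    n, m = len(canonical), len(recognized)
    # cost[i][j] = edit distance between canonical[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the bottom-right corner to recover the mappings.
    mappings = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            canonical[i - 1] != recognized[j - 1]
        ):
            same = canonical[i - 1] == recognized[j - 1]
            kind = "match" if same else "substitution"
            mappings.append((kind, canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            mappings.append(("deletion", canonical[i - 1], None))
            i -= 1
        else:
            mappings.append(("insertion", None, recognized[j - 1]))
            j -= 1
    mappings.reverse()
    return mappings


def mapping_histogram(word_pairs):
    """Count pronunciation mappings over (canonical, recognized) pairs."""
    counts = Counter()
    for canonical, recognized in word_pairs:
        counts.update(align_phones(canonical, recognized))
    return counts


# Made-up phone sequences for two words of a hypothetical word set.
hist = mapping_histogram([
    (["k", "ae", "t"], ["k", "t"]),
    (["b", "a"], ["p", "a"]),
])
```

In this sketch, each distinct mapping (for example, a substitution of one phone by another) becomes one histogram bin; a per-speaker vector of such counts is the kind of input a sparse linear regressor could then use to predict an intelligibility score.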


Pages: 694-704

Page count: 11
