An End-to-End Approach to Automatic Speech Assessment for People with Aphasia

Cited by: 0

Authors:

Qin, Ying [1]

Lee, Tan [1]

Wu, Yuzhong [1]

Kong, Anthony Pak Hin [2]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

[2] Univ Cent Florida, Dept Commun Sci & Disorders, Orlando, FL 32816 USA

Source:

2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2018

Keywords:

Pathological speech assessment; end-to-end; Cantonese; PROGRESSIVE APHASIA;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Conventionally, automatic assessment of pathological speech involves two main steps: (1) extraction of pathology-specific features; (2) classification or regression of the extracted features. Given the great variety of speech and language disorders, feature design is never a straightforward task, yet it is critical to assessment performance. This paper presents an end-to-end approach to automatic speech assessment for Cantonese-speaking people with aphasia (PWA). The assessment is formulated as a binary classification problem that differentiates PWA with high subjective assessment scores from those with low scores. Sequence-to-one GRU-RNN and CNN models are applied to realize the end-to-end mapping from speech signals to the classification result. The speech features used for assessment are learned implicitly by the neural network model. Preliminary experimental results show that the end-to-end approach can reach a performance level comparable to that of the conventional two-step approach. The results also suggest that the CNN performs better than the sequence-to-one GRU-RNN in this specific task.


Pages: 66-70

Page count: 5
