An End-to-End Approach to Automatic Speech Assessment for People with Aphasia

Citations: 0
Authors
Qin, Ying [1]
Lee, Tan [1]
Wu, Yuzhong [1]
Kong, Anthony Pak Hin [2]
Affiliations
[1] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[2] Univ Cent Florida, Dept Commun Sci & Disorders, Orlando, FL 32816 USA
Source
2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2018
Keywords
Pathological speech assessment; end-to-end; Cantonese; PROGRESSIVE APHASIA;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Conventionally, automatic assessment of pathological speech involves two main steps: (1) extraction of pathology-specific features; (2) classification or regression on the extracted features. Given the great variety of speech and language disorders, feature design is never a straightforward task, yet it is critical to assessment performance. This paper presents an end-to-end approach to automatic speech assessment for Cantonese-speaking people with aphasia (PWA). The assessment is formulated as a binary classification problem that differentiates PWA with high subjective assessment scores from those with low scores. Sequence-to-one GRU-RNN and CNN models are applied to realize the end-to-end mapping from speech signals to the classification result. The speech features used for assessment are learned implicitly by the neural network model. Preliminary experimental results show that the end-to-end approach can reach a performance level comparable to that of the conventional two-step approach. The results also suggest that the CNN performs better than the sequence-to-one GRU-RNN on this specific task.
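As an illustration of the sequence-to-one idea mentioned in the abstract, the following is a minimal NumPy sketch of a GRU that reads a sequence of acoustic frames and emits a single probability for the binary decision. It is not the authors' model: the function names (`init_params`, `gru_sequence_to_one`), the 40-dimensional frames, the hidden size, and the random untrained weights are all assumptions made only to show the shape of the computation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hid, seed=0):
    """Random, untrained GRU weights (hypothetical sizes, illustration only)."""
    rng = np.random.default_rng(seed)

    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))

    return {
        # update gate
        "Wz": mat(n_hid, n_in), "Uz": mat(n_hid, n_hid), "bz": np.zeros(n_hid),
        # reset gate
        "Wr": mat(n_hid, n_in), "Ur": mat(n_hid, n_hid), "br": np.zeros(n_hid),
        # candidate hidden state
        "Wh": mat(n_hid, n_in), "Uh": mat(n_hid, n_hid), "bh": np.zeros(n_hid),
        # output layer mapping the final hidden state to one logit
        "w_out": rng.normal(0.0, 0.1, size=n_hid), "b_out": 0.0,
    }

def gru_sequence_to_one(frames, p):
    """Run a GRU over frames (shape: time x n_in) and return one probability.

    Only the final hidden state feeds the classifier, which is what makes
    the mapping sequence-to-one.
    """
    h = np.zeros(p["bz"].shape[0])
    for x in frames:
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
        h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
        # One common convention for the update; some references swap z and 1 - z.
        h = (1.0 - z) * h + z * h_cand
    return float(sigmoid(p["w_out"] @ h + p["b_out"]))

if __name__ == "__main__":
    params = init_params(n_in=40, n_hid=16)
    # Stand-in for one utterance: 200 frames of 40-dimensional features.
    utterance = np.random.default_rng(1).normal(size=(200, 40))
    prob_high = gru_sequence_to_one(utterance, params)
    print("predicted class:", "high score" if prob_high >= 0.5 else "low score")
```

In a trained system the weights would be fitted with a binary cross-entropy loss against the high/low labels; the sketch covers only the forward pass.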
Pages: 66-70
Page count: 5