DETECTING ALZHEIMER'S DISEASE FROM SPEECH USING NEURAL NETWORKS WITH BOTTLENECK FEATURES AND DATA AUGMENTATION

Cited by: 17
Authors
Liu, Zhaoci [1]
Guo, Zhiqiang [1]
Ling, Zhenhua [1]
Li, Yunxia [2]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
[2] Tongji Univ, Shanghai Tongji Hosp, Sch Med, Shanghai, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Alzheimer's disease; speech analysis; neural networks; bottleneck features; data augmentation; automatic diagnosis
DOI
10.1109/ICASSP39728.2021.9413566
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper presents a method of detecting Alzheimer's disease (AD) from the spontaneous speech of subjects in a picture description task using neural networks. This method does not rely on the manual transcriptions and annotations of a subject's speech, but utilizes the bottleneck features extracted from audio using an ASR model. The neural network contains convolutional neural network (CNN) layers for local context modeling, bidirectional long short-term memory (BiLSTM) layers for global context modeling and an attention pooling layer for classification. Furthermore, a masking-based data augmentation method is designed to deal with the data scarcity problem. Experiments on the DementiaBank dataset show that the detection accuracy of our proposed method is 82.59%, which is better than the baseline method based on manually-designed acoustic features and support vector machines (SVM), and achieves the state-of-the-art performance of detecting AD using only audio data on this dataset.
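As a rough illustration of two components named in the abstract, the sketch below shows one possible form of masking-based augmentation and of attention pooling over a sequence of frame-level features. The abstract does not give the details, so everything specific here is an assumption: the function names `mask_features` and `attention_pool`, the zero fill, the band widths, and the single scoring vector `w` are hypothetical, and the CNN and BiLSTM layers are omitted.

```python
import numpy as np


def mask_features(feats, max_time_width=20, max_feat_width=8, rng=None):
    """Return a copy of feats (frames x dims) with one time band and
    one feature band set to zero.

    Assumption: zero-filled random bands; the paper's actual masking
    scheme is not described in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = feats.copy()
    n_frames, n_dims = out.shape

    # Mask a random run of consecutive frames (width may be 0).
    t_width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
    t_start = int(rng.integers(0, n_frames - t_width + 1))
    out[t_start:t_start + t_width, :] = 0.0

    # Mask a random run of consecutive feature dimensions (width may be 0).
    f_width = int(rng.integers(0, min(max_feat_width, n_dims) + 1))
    f_start = int(rng.integers(0, n_dims - f_width + 1))
    out[:, f_start:f_start + f_width] = 0.0
    return out


def attention_pool(hidden, w):
    """Collapse hidden (frames x dims) into one dims-sized vector.

    Assumption: each frame is scored by a dot product with a learned
    vector w, and the scores are normalized with a softmax.
    """
    scores = hidden @ w                 # one score per frame
    scores = scores - scores.max()      # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()            # softmax over frames
    return weights @ hidden, weights    # weighted average of the frames
```

In a training loop, each utterance's feature matrix would be passed through `mask_features` to create extra training variants, and the pooled vector from `attention_pool` would feed a final classification layer.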
Pages: 7323-7327
Number of pages: 5