DETECTING ALZHEIMER'S DISEASE FROM SPEECH USING NEURAL NETWORKS WITH BOTTLENECK FEATURES AND DATA AUGMENTATION

Cited by: 17

Authors:

Liu, Zhaoci [1]

Guo, Zhiqiang [1]

Ling, Zhenhua [1]

Li, Yunxia [2]

Affiliations:

[1] University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, People's Republic of China

[2] Tongji University, Shanghai Tongji Hospital, School of Medicine, Shanghai, People's Republic of China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Alzheimer's disease; speech analysis; neural networks; bottleneck features; data augmentation; automatic diagnosis

DOI:

10.1109/ICASSP39728.2021.9413566

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper presents a neural-network method for detecting Alzheimer's disease (AD) from the spontaneous speech of subjects performing a picture description task. The method does not rely on manual transcriptions or annotations of a subject's speech; instead, it uses bottleneck features extracted from the audio by an automatic speech recognition (ASR) model. The network consists of convolutional neural network (CNN) layers for local context modeling, bidirectional long short-term memory (BiLSTM) layers for global context modeling, and an attention pooling layer for classification. In addition, a masking-based data augmentation method is designed to address data scarcity. Experiments on the DementiaBank dataset show that the proposed method reaches a detection accuracy of 82.59%, outperforming a baseline that combines manually designed acoustic features with support vector machines (SVM) and achieving state-of-the-art performance for AD detection from audio alone on this dataset.
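As a rough illustration of two ideas named in the abstract, the sketch below shows one possible form of masking-based augmentation and of attention pooling over a sequence of frame-level features. The abstract does not specify how the masks are drawn or how attention scores are computed, so everything here is an assumption: the function names `mask_features` and `attention_pool`, the zero-fill masking of one time span and one feature band, and the dot-product scoring against a fixed `query` vector are hypothetical choices, not the authors' implementation. The CNN and BiLSTM layers are omitted.

```python
import math
import random


def mask_features(frames, max_time_width=3, max_feat_width=2, rng=None):
    """Return a masked copy of `frames` (a list of equal-length float lists).

    Assumed scheme: zero out one random span of consecutive frames and one
    random band of consecutive feature dimensions. The input is not modified.
    """
    rng = rng or random.Random()
    n_frames, n_feats = len(frames), len(frames[0])
    out = [list(f) for f in frames]

    # Time mask: blank a span of up to `max_time_width` consecutive frames.
    t_width = rng.randint(0, min(max_time_width, n_frames))
    t_start = rng.randint(0, n_frames - t_width)
    for t in range(t_start, t_start + t_width):
        out[t] = [0.0] * n_feats

    # Feature mask: blank a band of up to `max_feat_width` dimensions.
    f_width = rng.randint(0, min(max_feat_width, n_feats))
    f_start = rng.randint(0, n_feats - f_width)
    for row in out:
        for j in range(f_start, f_start + f_width):
            row[j] = 0.0
    return out


def attention_pool(frames, query):
    """Collapse a frame sequence into one vector with softmax weights.

    Assumed scoring: dot product between each frame and `query`. In a trained
    model the query would be a learned parameter; here it is passed in.
    Returns (pooled_vector, weights).
    """
    scores = [sum(x * q for x, q in zip(f, query)) for f in frames]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_feats = len(frames[0])
    pooled = [sum(w * f[j] for w, f in zip(weights, frames))
              for j in range(n_feats)]
    return pooled, weights


# Toy usage: 6 frames of 4-dimensional stand-in "bottleneck" features.
demo_rng = random.Random(0)
features = [[demo_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
augmented = mask_features(features, rng=demo_rng)
utterance_vector, frame_weights = attention_pool(
    augmented, query=[0.5, -0.5, 0.25, 0.0]
)
```

In this sketch, each call to `mask_features` yields a different masked copy of the same recording, which is how masking can enlarge a small training set, and `attention_pool` turns a variable-length sequence into a fixed-size vector that a classifier could consume.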


Pages: 7323-7327

Page count: 5

