Automatic detection and assessment of Alzheimer Disease using speech and language technologies in low-resource scenarios

Cited by: 37

Authors:

Pappagari, Raghavendra [1]

Cho, Jaejin [1]

Joshi, Sonal [1]

Moro-Velazquez, Laureano [1]

Zelasko, Piotr [1,2]

Villalba, Jesus [1,2]

Dehak, Najim [1,2]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

INTERSPEECH 2021 | 2021

Keywords:

Alzheimer Disease; Automatic Speech Recognition; Mini-Mental Status Evaluation; DEMENTIA RECOGNITION; FEATURES; SYSTEM;

DOI:

10.21437/Interspeech.2021-1850

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

In this study, we analyze the use of speech and speaker recognition technologies and natural language processing to detect Alzheimer disease (AD) and estimate mini-mental status evaluation (MMSE) scores. We used speech recordings from the Interspeech 2021 ADReSS(o) challenge dataset. Our work focuses on adapting state-of-the-art speaker recognition and language models individually and later collectively to examine their complementary behavior for the tasks. We used speech embedding techniques such as x-vectors and prosody features to characterize the speech signals. We also employed automatic speech recognition (ASR) with interpolated language models to obtain transcriptions used to fine-tune the BERT models that classify and assess the speakers. Our results indicate that the fusion of scores obtained from the multiple acoustic and linguistic models provides the best detection results, suggesting that they contain complementary information. A separate analysis of the models indicates that linguistic models outperform acoustic models in detection and prediction tasks. However, acoustic models can provide better results than linguistic models under certain circumstances due to the errors in ASR transcriptions, which indicates that the performance of linguistic models relies on the performance of ASRs. Our best models provide 84.51% accuracy in automatic detection of AD and an RMSE of 3.85 in MMSE prediction.
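The abstract reports score-level fusion of acoustic and linguistic models, evaluated by detection accuracy and by RMSE on MMSE prediction. The sketch below illustrates those three ideas only. The fusion rule (a weighted average), the 0.5 decision threshold, the function names, and all numeric values are assumptions made for illustration; the abstract does not specify how the authors fused their scores.

```python
import math


def fuse_scores(acoustic, linguistic, weight=0.5):
    """Weighted average of two per-speaker score lists.

    Assumption: simple linear fusion; the paper's actual rule is not
    given in the abstract.
    """
    if len(acoustic) != len(linguistic):
        raise ValueError("score lists must have the same length")
    return [weight * a + (1.0 - weight) * l for a, l in zip(acoustic, linguistic)]


def accuracy(scores, labels, threshold=0.5):
    """Fraction of speakers whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def rmse(predicted, actual):
    """Root-mean-square error between predicted and reference MMSE scores."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values invented for illustration only (not results from the paper).
acoustic_scores = [0.9, 0.4, 0.3, 0.2]    # hypothetical AD posteriors
linguistic_scores = [0.6, 0.8, 0.6, 0.1]  # hypothetical AD posteriors
labels = [1, 1, 0, 0]                     # 1 = AD, 0 = control

fused_scores = fuse_scores(acoustic_scores, linguistic_scores)

print("acoustic accuracy:  ", accuracy(acoustic_scores, labels))
print("linguistic accuracy:", accuracy(linguistic_scores, labels))
print("fused accuracy:     ", accuracy(fused_scores, labels))

# Hypothetical MMSE predictions against reference scores.
print("MMSE RMSE:", rmse([20, 25, 28, 30], [23, 25, 24, 30]))
```

In this toy case each model misclassifies a different speaker, so averaging their scores corrects both errors. That is one way models can carry complementary information, though real systems need not behave this cleanly.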

Pages: 3825-3829

Page count: 5
