Multimodal deep learning for dementia classification using text and audio

Times cited: 1
Authors
Lin, Kaiying [1, 2]
Washington, Peter Y. [1]
Affiliations
[1] Univ Hawaii, Dept Informat & Comp Sci, Honolulu, HI 96822 USA
[2] Univ Hawaii, Dept Linguist, Honolulu, HI 96822 USA
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
U.S. National Science Foundation;
Keywords
ALZHEIMERS-DISEASE;
DOI
10.1038/s41598-024-64438-1
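Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code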
07 ; 0710 ; 09 ;
Abstract
Dementia is a progressive neurological disorder that affects the daily lives of older adults, impacting their verbal communication and cognitive function. Early diagnosis is important to enhance the lifespan and quality of life for affected individuals. Despite its importance, diagnosing dementia is a complex process. Automated machine learning solutions involving multiple types of data have the potential to improve the process of automated dementia screening. In this study, we build deep learning models to classify dementia cases from controls using the Pitt Cookie Theft dataset from DementiaBank, a database of short participant responses to the structured task of describing a picture of a cookie theft. We fine-tune Wav2vec and Word2vec baseline models to make binary predictions of dementia from audio recordings and text transcripts, respectively. We conduct experiments with four versions of the dataset: (1) the original data, (2) the data with short sentences removed, (3) text-based augmentation of the original data, and (4) text-based augmentation of the data with short sentences removed. Our results indicate that synonym-based text data augmentation generally enhances the performance of models that incorporate the text modality. Without data augmentation, models using the text modality achieve around 60% accuracy and 70% AUROC scores, and with data augmentation, the models achieve around 80% accuracy and 90% AUROC scores. We do not observe significant improvements in performance with the addition of audio or timestamp information into the model. We include a qualitative error analysis of the sentences that are misclassified under each study condition. This study provides preliminary insights into the effects of both text-based data augmentation and multimodal deep learning for automated dementia classification.
Pages: 10
Related papers
35 items in total
  • [1] [Anonymous], 2015, Keras
  • [2] Diagnosis and Management of Dementia: Review
    Arvanitakis, Zoe
    Shah, Raj C.
    Bennett, David A.
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2019, 322 (16): 1589 - 1599
  • [3] Baevski A, 2020, ADV NEUR IN, V33
  • [4] To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection
    Balagopalan, Aparna
    Eyre, Benjamin
    Rudzicz, Frank
    Novikova, Jekaterina
    [J]. INTERSPEECH 2020, 2020: 2167 - 2171
  • [5] THE NATURAL-HISTORY OF ALZHEIMERS-DISEASE - DESCRIPTION OF STUDY COHORT AND ACCURACY OF DIAGNOSIS
    BECKER, JT
    BOLLER, F
    LOPEZ, OL
    SAXTON, J
    MCGONIGLE, KL
    MOOSSY, J
    HANIN, I
    WOLFSON, SK
    DETRE, K
    HOLLAND, A
    GUR, D
    LATCHAW, R
    BRENNER, R
    [J]. ARCHIVES OF NEUROLOGY, 1994, 51 (06): 585 - 594
  • [6] Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study
    Chi, Nathan A.
    Washington, Peter
    Kline, Aaron
    Husic, Arman
    Hou, Cathy
    He, Chloe
    Dunlap, Kaitlyn
    Wall, Dennis P.
    [J]. JMIR PEDIATRICS AND PARENTING, 2022, 5 (02)
  • [7] Towards Computer-Based Automated Screening of Dementia Through Spontaneous Speech
    Chlasta, Karol
    Wolk, Krzysztof
    [J]. FRONTIERS IN PSYCHOLOGY, 2021, 11
  • [8] Devlin J, 2019, Arxiv, DOI arXiv:1810.04805
  • [9] Crossing the "Cookie Theft" Corpus Chasm: Applying What BERT Learns From Outside Data to the ADReSS Challenge Dementia Detection Task
    Guo, Yue
    Li, Changye
    Roan, Carol
    Pakhomov, Serguei
    Cohen, Trevor
    [J]. FRONTIERS IN COMPUTER SCIENCE, 2021, 3
  • [10] Guo Z., 2020, P 28 INT C COMP LING, P6161, DOI 10.18653/v1/2020.coling-main.542