Detection of dementia on voice recordings using deep learning: a Framingham Heart Study

Cited by: 48
Authors
Xue, Chonghua [1]
Karjadi, Cody [2, 7, 8]
Paschalidis, Ioannis Ch. [3, 4, 5, 6]
Au, Rhoda [2, 7, 8, 9, 10]
Kolachalama, Vijaya B. [1, 9, 11, 12]
Affiliations
[1] Boston Univ, Sch Med, Dept Med, Sect Computat Biomed, 72 E Concord St,Evans 636, Boston, MA 02118 USA
[2] Boston Univ, Framingham Heart Study, Boston, MA 02118 USA
[3] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02118 USA
[4] Boston Univ, Dept Syst Engn, Boston, MA 02118 USA
[5] Boston Univ, Dept Biomed Engn, Boston, MA 02118 USA
[6] Boston Univ, Fac Comp & Data Sci, Boston, MA 02118 USA
[7] Boston Univ, Sch Med, Dept Anat & Neurobiol, Boston, MA 02118 USA
[8] Boston Univ, Sch Med, Dept Neurol, Boston, MA 02118 USA
[9] Boston Univ, Alzheimers Dis Ctr, Boston, MA 02118 USA
[10] Boston Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02118 USA
[11] Boston Univ, Dept Comp Sci, Boston, MA 02115 USA
[12] Boston Univ, Fac Comp & Data Sci, Boston, MA 02115 USA
Funding
U.S. National Science Foundation
Keywords
Dementia; Machine learning; Digital health; Voice recording; Neuropsychological testing; DEPRESSION; LANGUAGE; DISEASE;
DOI
10.1186/s13195-021-00888-3
Chinese Library Classification (CLC) number
R74 [Neurology and Psychiatry]
Discipline classification code
Abstract
Background: Reliable, affordable, and easy-to-use strategies for detecting dementia are sorely needed. Digital technologies, such as individual voice recordings, offer an attractive modality for assessing cognition, but methods that can automatically analyze such data are not readily available.
Methods and findings: We used 1264 voice recordings of neuropsychological examinations administered to participants from the Framingham Heart Study (FHS), a community-based longitudinal observational study. The recordings were 73 min in duration, on average, and contained at least two speakers (participant and examiner). Of the total, 483 recordings were of participants with normal cognition (NC), 451 were of participants with mild cognitive impairment (MCI), and 330 were of participants with dementia (DE). We developed two deep learning models, a two-level long short-term memory (LSTM) network and a convolutional neural network (CNN), which used the audio recordings for two tasks: classifying recordings of participants with DE versus those with NC, and classifying recordings of participants with DE versus those without DE (NDE, i.e., NC+MCI). Results are based on 5-fold cross-validation and reported as mean ± std. For DE versus NC, the LSTM model achieved an area under the receiver operating characteristic curve (AUC) of 0.740 ± 0.017, a balanced accuracy of 0.647 ± 0.027, and a weighted F1 score of 0.596 ± 0.047; the CNN model achieved an AUC of 0.805 ± 0.027, a balanced accuracy of 0.743 ± 0.015, and a weighted F1 score of 0.742 ± 0.033. For DE versus NDE, the LSTM model achieved an AUC of 0.734 ± 0.014, a balanced accuracy of 0.675 ± 0.013, and a weighted F1 score of 0.671 ± 0.015; the CNN model achieved an AUC of 0.746 ± 0.021, a balanced accuracy of 0.652 ± 0.020, and a weighted F1 score of 0.635 ± 0.031.
Conclusion: This proof-of-concept study demonstrates that automated, deep learning-driven processing of audio recordings of neuropsychological testing, performed on individuals recruited within a community cohort setting, can facilitate dementia screening.
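The abstract reports three metrics per task. The classes are imbalanced: in the DE versus NDE task, DE accounts for 330 of 1264 recordings, roughly 26%. Plain accuracy can be misleading on data like this, which is presumably why balanced accuracy and weighted F1 are reported. The following is a minimal sketch of how such metrics can be computed for a binary task, written in plain Python. It is not the authors' evaluation code; the function names, the toy labels and scores, and the use of the sample standard deviation for the "± std" summary are all assumptions made for illustration.

```python
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (sensitivity and specificity for two classes)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    total = 0.0
    for c in set(y_true):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * (tp + fn)  # tp + fn is the number of true members of class c
    return total / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(fold_values):
    """Mean and sample standard deviation across cross-validation folds (assumed convention)."""
    return mean(fold_values), stdev(fold_values)


# Toy data, invented for illustration: 1 = DE, 0 = not DE.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # hypothetical 0.5 decision threshold

print(balanced_accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
print(auc(y_true, scores))
```

In a 5-fold setting, each metric would be computed once per held-out fold and the five values passed to `summarize` to obtain a "mean ± std" figure of the kind quoted in the abstract. AUC is computed from the continuous scores, whereas balanced accuracy and weighted F1 depend on the chosen decision threshold.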
Pages: 15