Deep Learning and Artificial Intelligence Applied to Model Speech and Language in Parkinson's Disease

Cited by: 10

Authors:

Escobar-Grisales, Daniel [1]

Rios-Urrego, Cristian David [1]

Orozco-Arroyave, Juan Rafael [1,2]

Affiliations:

[1] Univ Antioquia, Fac Engn, GITA Lab, Medellin 050010, Colombia

[2] Univ Erlangen Nurnberg, LME Lab, D-91054 Erlangen, Germany

Source:

DIAGNOSTICS | 2023, Vol. 13, Issue 13

Keywords:

Parkinson's disease; natural language processing; speech processing; convolutional neural networks; Wav2Vec; word embeddings; discourse

DOI:

10.3390/diagnostics13132163

Chinese Library Classification (CLC) number:

R5 [Internal Medicine]

Discipline classification codes:

1002; 100201

Abstract:

Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder in the world, and it is characterized by the production of different motor and non-motor symptoms which negatively affect speech and language production. For decades, the research community has been working on methodologies to automatically model these biomarkers to detect and monitor the disease; however, although speech impairments have been widely explored, language remains underexplored despite being a valuable source of information, especially to assess cognitive impairments associated with non-motor symptoms. This study proposes the automatic assessment of PD patients using different methodologies to model speech and language biomarkers. One-dimensional and two-dimensional convolutional neural networks (CNNs), along with pre-trained models such as Wav2Vec 2.0, BERT, and BETO, were considered to classify PD patients vs. Healthy Control (HC) subjects. The first approach consisted of modeling speech and language independently. Then, the best representations from each modality were combined following early, joint, and late fusion strategies. The results show that the speech modality yielded an accuracy of up to 88%, thus outperforming all language representations, including the multi-modal approach. These results suggest that speech representations better discriminate PD patients and HC subjects than language representations. When analyzing the fusion strategies, we observed that changes in the time span of the multi-modal representation could produce a significant loss of information in the speech modality, which was likely linked to a decrease in accuracy in the multi-modal experiments. Further experiments are necessary to validate this claim with other fusion methods using different time spans.
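The abstract names early, joint, and late fusion but does not describe how they were implemented. As a rough illustration only, the sketch below shows what early fusion (concatenating feature vectors before classification) and late fusion (combining the scores of two separately trained classifiers) commonly look like. All function names, array shapes, the fusion weight, and the 0.5 decision threshold are assumptions made for this example, not details taken from the paper. Joint fusion is omitted because it requires training a shared network end to end.

```python
import numpy as np


def early_fusion(speech_feats, language_feats):
    """Concatenate per-subject feature vectors from both modalities.

    Both inputs are (n_subjects, n_features) arrays; the result would be
    fed to a single classifier. (Hypothetical helper, not from the paper.)
    """
    return np.concatenate([speech_feats, language_feats], axis=1)


def late_fusion(speech_scores, language_scores, weight=0.5):
    """Weighted average of per-subject PD probabilities.

    Each input holds the output of a classifier trained on one modality.
    The weight is an assumed hyperparameter, not a value from the paper.
    """
    return weight * speech_scores + (1.0 - weight) * language_scores


# Toy data standing in for real embeddings (e.g. speech and text features).
rng = np.random.default_rng(0)
speech_feats = rng.normal(size=(4, 8))    # 4 subjects, 8 speech features
language_feats = rng.normal(size=(4, 5))  # 4 subjects, 5 language features
fused = early_fusion(speech_feats, language_feats)  # shape (4, 13)

# Toy per-modality probabilities of the PD class for the same 4 subjects.
speech_scores = np.array([0.9, 0.2, 0.8, 0.4])
language_scores = np.array([0.7, 0.4, 0.4, 0.2])
late_scores = late_fusion(speech_scores, language_scores, weight=0.5)

# Assumed decision rule: label 1 (PD) if the fused score reaches 0.5.
predictions = (late_scores >= 0.5).astype(int)
```

In this toy setup, early fusion requires both modalities to be summarized at the same granularity (one vector per subject), which is one place where a change in time span could discard information from the speech signal, as the abstract suggests.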

Pages: 16