Deep Learning and Artificial Intelligence Applied to Model Speech and Language in Parkinson's Disease

Cited by: 10
Authors
Escobar-Grisales, Daniel [1]
Rios-Urrego, Cristian David [1]
Orozco-Arroyave, Juan Rafael [1, 2]
Affiliations
[1] Univ Antioquia, Fac Engn, GITA Lab, Medellin 050010, Colombia
[2] Univ Erlangen Nurnberg, LME Lab, D-91054 Erlangen, Germany
Keywords
Parkinson's disease; natural language processing; speech processing; convolutional neural networks; Wav2Vec; word embeddings; discourse
DOI
10.3390/diagnostics13132163
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder in the world, and it is characterized by the production of different motor and non-motor symptoms which negatively affect speech and language production. For decades, the research community has been working on methodologies to automatically model these biomarkers to detect and monitor the disease; however, although speech impairments have been widely explored, language remains underexplored despite being a valuable source of information, especially to assess cognitive impairments associated with non-motor symptoms. This study proposes the automatic assessment of PD patients using different methodologies to model speech and language biomarkers. One-dimensional and two-dimensional convolutional neural networks (CNNs), along with pre-trained models such as Wav2Vec 2.0, BERT, and BETO, were considered to classify PD patients vs. Healthy Control (HC) subjects. The first approach consisted of modeling speech and language independently. Then, the best representations from each modality were combined following early, joint, and late fusion strategies. The results show that the speech modality yielded an accuracy of up to 88%, thus outperforming all language representations, including the multi-modal approach. These results suggest that speech representations better discriminate PD patients and HC subjects than language representations. When analyzing the fusion strategies, we observed that changes in the time span of the multi-modal representation could produce a significant loss of information in the speech modality, which was likely linked to a decrease in accuracy in the multi-modal experiments. Further experiments are necessary to validate this claim with other fusion methods using different time spans.
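The early and late fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal default weighting, and the 0.5 decision threshold are all assumptions made for illustration. Joint fusion, which requires training both branches together, is not shown.

```python
from typing import List, Sequence


def early_fusion(speech_vec: Sequence[float], language_vec: Sequence[float]) -> List[float]:
    """Early fusion: concatenate the per-modality feature vectors so that a
    single classifier sees both modalities at once (hypothetical helper)."""
    return list(speech_vec) + list(language_vec)


def late_fusion(speech_prob: float, language_prob: float, speech_weight: float = 0.5) -> float:
    """Late fusion: combine the PD probabilities produced by two independently
    trained classifiers with a weighted average (the weighting is an assumption)."""
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    return speech_weight * speech_prob + (1.0 - speech_weight) * language_prob


def decide(prob_pd: float, threshold: float = 0.5) -> str:
    """Map a fused PD probability to a class label (threshold is an assumption)."""
    return "PD" if prob_pd >= threshold else "HC"


if __name__ == "__main__":
    # Toy values only; they are not taken from the study.
    fused_features = early_fusion([0.1, 0.2], [0.3])
    fused_prob = late_fusion(speech_prob=0.8, language_prob=0.4)
    print(len(fused_features), decide(fused_prob))
```

In early fusion the combined vector would be fed to one classifier, whereas in late fusion each modality keeps its own classifier and only the output scores are merged.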
Pages: 16