USING ACOUSTIC DEEP NEURAL NETWORK EMBEDDINGS TO DETECT MULTIPLE SCLEROSIS FROM SPEECH

Cited by: 8

Authors:

Gosztolya, Gabor [1, 2]

Toth, Laszlo [1]

Svindt, Veronika [3]

Bona, Judit [4]

Hoffmann, Ildiko [3, 5]

Affiliations:

[1] Univ Szeged, Inst Informat, Szeged, Hungary

[2] ELRN SZTE Res Grp Artificial Intelligence, Szeged, Hungary

[3] ELRN, Res Ctr Linguist, Budapest, Hungary

[4] Eotvos Lorand Univ, Dept Appl Linguist & Phonet, Budapest, Hungary

[5] Univ Szeged, Dept Linguist, Szeged, Hungary

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Multiple Sclerosis; medical speech processing; Deep Neural Networks; embeddings; x-vectors; dysarthria; language

DOI:

10.1109/ICASSP43922.2022.9746856

Chinese Library Classification:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system. It affects cognitive and motor functions, and the limitation of executive functions can also manifest itself in speech production. Due to this, automatic speech analysis might serve as an effective technique for assessing MS, or for monitoring the status of the patient. However, choosing the features to be extracted from the recordings is not straightforward. In the past few years, general feature extractors such as i-vectors, d-vectors and x-vectors have found their way into automatic speech analysis. In this study we show that there is no need to employ a special neural network architecture such as x-vectors to calculate effective features, but (even more) indicative features can be derived on the basis of a standard Deep Neural Network acoustic model. From our results, these features could effectively be used to distinguish MS subjects from healthy controls, as we measured AUC scores up to 0.935. We found that classification performance depended only slightly on the choice of the hidden layer used to extract our features, but the speech task performed by the subject turned out to be an important factor.
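The abstract mentions two generic steps that can be illustrated in isolation: turning frame-level hidden-layer activations into one fixed-length feature vector per recording, and scoring a classifier with AUC. The following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say how frames are aggregated, so the mean-plus-standard-deviation pooling here is an assumption, and the function names `pool_embedding` and `auc` are hypothetical.

```python
from statistics import fmean, pstdev


def pool_embedding(frames):
    """Collapse frame-level activations into one utterance-level vector.

    `frames` is a list of equal-length lists of floats, one per frame.
    ASSUMPTION: mean + standard deviation pooling; the abstract does not
    specify the aggregation actually used.
    """
    dims = list(zip(*frames))  # transpose: one tuple per hidden unit
    return [fmean(d) for d in dims] + [pstdev(d) for d in dims]


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Returns the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, label 1 would stand for an MS subject and label 0 for a healthy control, and the pooled vectors would be passed to whatever classifier produces the scores.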


Pages: 6927-6931

Page count: 5

References (21 in total):

[1] [Anonymous], 1995, THESIS

[2] Cawley GC, 2010, J MACH LEARN RES, V11, P2079

[3] LIBSVM: A Library for Support Vector Machines [J]. Chang, Chih-Chung; Lin, Chih-Jen. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2(03)

[4] DYSARTHRIA IN MULTIPLE-SCLEROSIS [J]. DARLEY, FL; BROWN, JR; GOLDSTEIN, NP. JOURNAL OF SPEECH AND HEARING RESEARCH, 1972, 15(02): 229-+

[5] SUPPORT VECTOR MACHINES AND JOINT FACTOR ANALYSIS FOR SPEAKER VERIFICATION [J]. Dehak, Najim; Kenny, Patrick; Dehak, Reda; Glembek, Ondrej; Dumouchel, Pierre; Burget, Lukas; Hubeika, Valiantsina; Castaldo, Fabio. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009: 4237-+

[6] Cognitive Processes Underlying Verbal Fluency in Multiple Sclerosis [J]. Delgado-Alvarez, Alfonso; Matias-Guiu, Jordi A.; Delgado-Alonso, Cristina; Hernandez-Lorenzo, Laura; Cortes-Martinez, Ana; Vidorreta, Lucia; Montero-Escribano, Paloma; Pytel, Vanesa; Matias-Guiu, Jorge. FRONTIERS IN NEUROLOGY, 2021, 11

[7] Identifying Conflict Escalation and Primates by Using Ensemble X-vectors and Fisher Vector Features [J]. Egas-Lopez, Jose Vicente; Vetrab, Mercedes; Toth, Laszlo; Gosztolya, Gabor. INTERSPEECH 2021, 2021: 476-480

[8] FitzGerald J. F., 1987, Australian Journal of Human Communication Disorders, V15, P15, DOI 10.3109/ASL2.1987.15.ISSUE-2.02

[9] Gosztolya G, 2015, INT CONF ACOUST SPEE, P4570, DOI 10.1109/ICASSP.2015.7178836

[10] Speaker age classification and regression using i-vectors [J]. Grzybowska, Joanna; Kacprzak, Stanislaw. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 1402-1406
