Challenges of using longitudinal and cross-domain corpora on studies of pathological speech

Cited by: 7

Authors:

Botelho, Catarina [1,2]

Schultz, Tanja [2]

Abad, Alberto [1]

Trancoso, Isabel [1]

Affiliations:

[1] Univ Lisbon, INESC ID Inst Super Tecn, Lisbon, Portugal

[2] Univ Bremen, Cognit Syst Lab CSL, Bremen, Germany

Source:

INTERSPEECH 2022 | 2022

Keywords:

healthy speech; cross-corpora; clustering; DISEASE; ACCURACY;

DOI:

10.21437/Interspeech.2022-10995

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Several promising works have reported encouraging results in the field of speech in health; however, issues remain to be addressed before such systems can be deployed in clinical applications. One such issue is ensuring the generalisability and reliability of results. With this in mind, we perform a comparative analysis of healthy speech in two scenarios: (1) collected in six different datasets spoken in the same language, and (2) collected at different times in a single longitudinal corpus. We show that feature sets typically used for disease detection from speech (eGeMAPS, ComParE, ECAPA-TDNN embeddings and i-vectors) encode substantial information about the dataset or, in longitudinal studies, about recording conditions that change over time. We support these findings with classification results well above chance level in both scenarios, and with unsupervised clustering experiments, in which the data naturally cluster according to dataset.
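The dataset-identification idea in the abstract can be illustrated with a small, purely synthetic sketch. It is not the authors' pipeline: the feature vectors are random numbers rather than eGeMAPS, ComParE, ECAPA-TDNN or i-vector features, the nearest-centroid classifier is a stand-in chosen for brevity, and the names `make_corpus`, `dataset_id_accuracy`, `corpus_A`, `corpus_B` and the `offset` parameter are all hypothetical. It only shows how a small corpus-specific shift in the features of healthy speakers lets a classifier recover the source corpus above chance.

```python
import random
import statistics

def make_corpus(offset, n, dim, rng):
    """Synthetic 'healthy speech' features: shared variation plus a corpus-specific offset.

    The offset is an assumed stand-in for recording-condition differences.
    """
    return [[rng.gauss(0.0, 1.0) + offset for _ in range(dim)] for _ in range(n)]

def centroid(rows):
    """Per-dimension mean of a list of feature vectors."""
    return [statistics.fmean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dataset_id_accuracy(seed=0, n=200, dim=8, offset=0.5):
    """Train a nearest-centroid classifier to predict the source corpus; return test accuracy."""
    rng = random.Random(seed)
    corpora = {
        "corpus_A": make_corpus(0.0, n, dim, rng),
        "corpus_B": make_corpus(offset, n, dim, rng),
    }
    half = n // 2
    # First half of each corpus is used to estimate centroids, second half for testing.
    centroids = {name: centroid(rows[:half]) for name, rows in corpora.items()}
    correct = total = 0
    for name, rows in corpora.items():
        for x in rows[half:]:
            pred = min(centroids, key=lambda c: sq_dist(x, centroids[c]))
            correct += pred == name
            total += 1
    return correct / total

accuracy = dataset_id_accuracy()
chance = 0.5  # two equally sized corpora
print(f"dataset-identification accuracy: {accuracy:.2f} (chance: {chance:.2f})")
```

If the accuracy is well above chance even though every sample is labelled healthy, the features carry corpus information, which is the confound the abstract warns about when healthy and pathological speech come from different corpora or recording periods.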


Pages: 1921-1925

Page count: 5

References (29 in total):

[1] Ablimit A., 2022, ICASSP

[2] [Anonymous], 2005, INTERSPEECH-2005

[3] [Anonymous], 2016, Multimorbidity: Technical Series on Safer Primary Care

[4] THE NATURAL-HISTORY OF ALZHEIMERS-DISEASE - DESCRIPTION OF STUDY COHORT AND ACCURACY OF DIAGNOSIS [J]. BECKER, JT; BOLLER, F; LOPEZ, OL; SAXTON, J; MCGONIGLE, KL; MOOSSY, J; HANIN, I; WOLFSON, SK; DETRE, K; HOLLAND, A; GUR, D; LATCHAW, R; BRENNER, R. ARCHIVES OF NEUROLOGY, 1994, 51 (06): 585-594

[5] Transfer Learning and Data Augmentation Techniques to the COVID-19 Identification Tasks in ComParE 2021 [J]. Casanova, Edresson; Candido Jr, Arnaldo; Fernandes Jr, Ricardo Corso; Finger, Marcelo; Stefanel Gris, Lucas Rafael; Ponti, Moacir A.; Pinto da Silva, Daniel Peixoto. INTERSPEECH 2021, 2021: 446-450

[6] Botelho MC, 2019, INT CONF ACOUST SPEE, P5851, DOI 10.1109/ICASSP.2019.8682431

[7] THE IN-THE-WILD SPEECH MEDICAL CORPUS [J]. Correia, Joana; Teixeira, Francisco; Botelho, Catarina; Trancoso, Isabel; Raj, Bhiksha. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6973-6977

[8] Speech analysis for health: Current state-of-the-art and the increasing impact of deep learning [J]. Cummins, Nicholas; Baird, Alice; Schuller, Bjoern W. METHODS, 2018, 151: 41-54

[9] Front-End Factor Analysis for Speaker Verification [J]. Dehak, Najim; Kenny, Patrick J.; Dehak, Reda; Dumouchel, Pierre; Ouellet, Pierre. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 788-798

[10] ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification [J]. Desplanques, Brecht; Thienpondt, Jenthe; Demuynck, Kris. INTERSPEECH 2020, 2020: 3830-3834
