Improving Speech-Based Dysarthria Detection using Multi-task Learning with Gradient Projection

Cited by: 0

Authors:

Xiang, Yan [1]

Berisha, Visar [1,2]

Liss, Julie [2]

Chakrabarti, Chaitali [1]

Affiliations:

[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA

[2] Arizona State Univ, Coll Hlth Solut, Tempe, AZ USA

Source:

INTERSPEECH 2024 | 2024

Keywords:

Dysarthria detection; speech processing; deep neural network; multi-task learning; DISEASE;

DOI:

10.21437/Interspeech.2024-1563

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech analytic models based on deep learning are popular in clinical diagnostics. However, constraints on clinical data collection and sharing place limits on available dataset sizes, which adversely impacts trained model performance. Multi-task learning (MTL) has been utilized to mitigate the effect of limited sample size by jointly training on multiple tasks that are considered to be related. However, discrepancies between clinical and non-clinical tasks can reduce MTL efficiency and can even cause it to fail, especially when there are gradient conflicts. In this paper, we enhance the performance of dysarthria detection by using MTL with an auxiliary task of learning speaker embeddings. We propose a task-specific gradient projection method to overcome gradient conflicts. Our evaluation shows that the proposed MTL paradigm outperforms both single-task learning and conventional MTL under different data availability settings.
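
The abstract does not spell out the task-specific projection rule, so the following is only a minimal sketch of the general idea behind gradient projection for conflicting tasks, not the authors' method. It assumes a two-task setup in which the main-task gradient (dysarthria detection) is kept as-is and the auxiliary gradient (speaker embedding) has its conflicting component removed whenever the two gradients have a negative inner product. The function names, the flat-list gradient representation, and the `aux_weight` parameter are all hypothetical.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two flattened gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_auxiliary_gradient(
    g_main: Sequence[float],
    g_aux: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Remove from g_aux the component that opposes g_main.

    Assumption: a conflict is defined as a negative inner product.
    If there is no conflict, g_aux is returned unchanged.
    """
    inner = dot(g_aux, g_main)
    if inner >= 0.0:
        return list(g_aux)  # gradients agree, nothing to project
    norm_sq = dot(g_main, g_main)
    if norm_sq < eps:
        return list(g_aux)  # main gradient is ~0, projection is undefined
    scale = inner / norm_sq
    # Subtract the projection of g_aux onto g_main.
    return [a - scale * m for a, m in zip(g_aux, g_main)]


def combined_update(
    g_main: Sequence[float],
    g_aux: Sequence[float],
    aux_weight: float = 1.0,
) -> List[float]:
    """Shared-parameter update direction: main gradient plus projected auxiliary gradient."""
    g_aux_proj = project_auxiliary_gradient(g_main, g_aux)
    return [m + aux_weight * a for m, a in zip(g_main, g_aux_proj)]


if __name__ == "__main__":
    g_detect = [1.0, 0.0]    # hypothetical main-task gradient
    g_speaker = [-1.0, 1.0]  # hypothetical auxiliary gradient that conflicts with it
    print(project_auxiliary_gradient(g_detect, g_speaker))
    print(combined_update(g_detect, g_speaker))
```

In this sketch the projection is one-directional: only the auxiliary gradient is altered, so the clinical task's descent direction is never weakened by the non-clinical one. A symmetric variant would project each task's gradient against the other.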

Pages: 902-906

Page count: 5
