Zero-shot Multi-task Cough Sound Analysis with Speech Foundation Model Embeddings

Cited by: 0

Authors:

Laska, Brady [1]

Xi, Pengcheng [2,3]

Valdes, Julio J. [2]

Wallace, Bruce [1,4,5]

Goubran, Rafik [1]

Affiliations:

[1] Carleton Univ, Syst & Comp Engn, Ottawa, ON, Canada

[2] Natl Res Council Canada, Ottawa, ON, Canada

[3] Carleton Univ, Ottawa, ON, Canada

[4] Bruyere Res Inst, Ottawa, ON, Canada

[5] SAM3 Innovat Hub, Ottawa, ON, Canada

Source:

2024 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS, MEMEA 2024 | 2024

Keywords:

smart home; aging in place; cough signal; speech foundation model

DOI:

10.1109/MEMEA60663.2024.10596816

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering]

Discipline classification code:

0831

Abstract:

Supportive smart home systems with integrated personalized cough analysis can support independent living and aging in place by helping monitor the state of acute and chronic health conditions. The stages of recognizing coughs, associating them with an individual, and analyzing the cough characteristics have traditionally been handled independently, using task-specific networks or algorithms. In contrast, recent transformer-based neural network speech foundation models trained on internet-scale datasets have demonstrated strong performance across a wide range of tasks. Learning such a general-purpose cough representation has been hampered by the lack of large-scale cough-specific datasets. In this work, we demonstrate that the embeddings from a speech foundation model (w2v BERT 2.0) can be used as a powerful multi-purpose cough representation. We show that cough information is well encoded in the model, despite it being trained on speech data with no cough-specific fine-tuning or adapters. Zero-shot linear classification on the cough embeddings achieves strong performance on cough/breathing/speech discrimination (100%), cougher verification (96.9%), cougher identification (84.4%), and wet/dry cough classification (93.8%) tasks. We also show that distance metrics between cough embeddings are meaningful and use them to conduct an explainable analysis of an unlabelled sample with similarity-based retrieval from a labelled dataset. We note that these capabilities emerge in the early layers of the network, and that the cough embeddings occupy a small region of the embedding space, motivating future work into lower-complexity cough-specific representations suitable for embedded cough analysis.
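
The similarity-based retrieval mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-dimensional toy vectors stand in for real model embeddings, cosine similarity is assumed as the distance metric (the abstract does not name one), and the helper names `cosine_similarity`, `retrieve_nearest`, and `majority_label` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_nearest(query, labelled, k=3):
    """Return the k (similarity, label) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, emb), label) for emb, label in labelled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def majority_label(neighbours):
    """Pick the most frequent label among the retrieved neighbours."""
    counts = {}
    for _, label in neighbours:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

# Toy vectors (assumption): placeholders for embeddings of labelled coughs.
labelled_set = [
    ([0.9, 0.1, 0.0], "wet"),
    ([0.8, 0.2, 0.1], "wet"),
    ([0.1, 0.9, 0.2], "dry"),
    ([0.0, 0.8, 0.3], "dry"),
]

# Placeholder for the embedding of an unlabelled cough sample.
query_embedding = [0.85, 0.15, 0.05]

neighbours = retrieve_nearest(query_embedding, labelled_set, k=3)
predicted = majority_label(neighbours)
```

The retrieved neighbours double as the explanation: the unlabelled sample is described by the labelled examples it most resembles, which can be inspected directly instead of relying on a classifier score alone.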

Pages: 6

50 items in total

[1] Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
Oh, Junhyuk
Singh, Satinder
Lee, Honglak
Kohli, Pushmeet
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[2] Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation
Xu, Xun
Hospedales, Timothy M.
Gong, Shaogang
COMPUTER VISION - ECCV 2016, PT II, 2016, 9906 : 343 - 359
[3] Canonical mean filter for almost zero-shot multi-task classification
Li, Yong
Wang, Heng
Ye, Xiang
APPLIED INTELLIGENCE, 2023, 53 (20) : 24422 - 24434
[4] Canonical mean filter for almost zero-shot multi-task classification
Yong Li
Heng Wang
Xiang Ye
Applied Intelligence, 2023, 53 : 24422 - 24434
[5] Zero-Shot Rationalization by Multi-Task Transfer Learning from Question Answering
Kung, Po-Nien
Yang, Tse-Hsuan
Chen, Yi-Cheng
Yin, Sheng-Siang
Chen, Yun-Nung
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 2187 - 2197
[6] Zero-Shot Rumor Detection via Meta Multi-Task Prompt Learning
Shi, Yu
Yu, Ning
Sun, Yawei
Liu, Jianyi
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2024, 47 (04) : 77 - 82
[7] Joint Embedding with Multi-Task Learning for Multi-Label Zero-Shot Action Recognition
An, Rongqiao
Miao, Zhenjiang
Li, Qingyu
PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 613 - 618
[8] Towards Zero-Shot Conditional Summarization with Adaptive Multi-Task Fine-Tuning
Goodwin, Travis R.
Savery, Max E.
Demner-Fushman, Dina
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020
[9] Maha Bhaashya at SemEval-2024 Task 6: Zero-Shot Multi-task Hallucination Detection
Bhamidipati, Patanjali
Malladi, Advaith
Shrivastava, Manish
Mamidi, Radhika
PROCEEDINGS OF THE 18TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2024, 2024, : 1685 - 1689
[10] Speech Enhancement with Zero-Shot Model Selection
Zezario, Ryandhimas E.
Fuh, Chiou-Shann
Wang, Hsin-Min
Tsao, Yu
29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 491 - 495