Zero-shot Multi-task Cough Sound Analysis with Speech Foundation Model Embeddings

Authors
Laska, Brady [1]
Xi, Pengcheng [2, 3]
Valdes, Julio J. [2]
Wallace, Bruce [1, 4, 5]
Goubran, Rafik [1]
Affiliations
[1] Carleton Univ, Syst & Comp Engn, Ottawa, ON, Canada
[2] Natl Res Council Canada, Ottawa, ON, Canada
[3] Carleton Univ, Ottawa, ON, Canada
[4] Bruyere Res Inst, Ottawa, ON, Canada
[5] SAM3 Innovat Hub, Ottawa, ON, Canada
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS, MEMEA 2024 | 2024
Keywords
smart home; aging in place; cough signal; speech foundation model
DOI
10.1109/MEMEA60663.2024.10596816
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Supportive smart home systems with integrated personalized cough analysis can support independent living and aging in place by helping monitor the state of acute and chronic health conditions. The stages of recognizing coughs, associating them with an individual, and analyzing the cough characteristics have traditionally been handled independently, using task-specific networks or algorithms. In contrast, recent transformer-based neural network speech foundation models trained on internet-scale datasets have demonstrated strong performance across a wide range of tasks. Learning such a general-purpose cough representation has been hampered by the lack of large-scale cough-specific datasets. In this work we demonstrate that the embeddings from a speech foundation model (w2v BERT 2.0) can be used as a powerful multi-purpose cough representation. We show that cough information is well encoded in the model, despite it being trained on speech data with no cough-specific fine-tuning or adapters. Zero-shot linear classification on the cough embeddings achieves strong performance on cough/breathing/speech discrimination (100%), cougher verification (96.9%), cougher identification (84.4%), and wet/dry cough classification (93.8%) tasks. We also show that distance metrics between cough embeddings are meaningful, and we use them to conduct explainable analysis of an unlabelled sample with similarity-based retrieval from a labelled dataset. We note that these capabilities emerge in the early layers of the network, and that the cough embeddings occupy a small region of the embedding space, motivating future work into lower-complexity cough-specific representations suitable for embedded cough analysis.
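The similarity-based retrieval described in the abstract can be illustrated with a minimal sketch. It assumes the embeddings have already been extracted by the foundation model. The choice of cosine similarity, the majority vote over the nearest neighbours, the function names, and the toy 3-dimensional vectors are all illustrative assumptions, not the authors' published method or data.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, labelled, k=3):
    """Return the k (similarity, label) pairs closest to the query embedding."""
    scored = [(cosine_similarity(query, emb), label) for emb, label in labelled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def majority_label(neighbours):
    """Predict the most common label among the retrieved neighbours."""
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


# Hypothetical stand-ins for real cough embeddings (made-up values).
labelled = [
    ([0.9, 0.1, 0.0], "wet"),
    ([0.8, 0.2, 0.1], "wet"),
    ([0.1, 0.9, 0.2], "dry"),
    ([0.0, 0.8, 0.3], "dry"),
]
query = [0.85, 0.15, 0.05]  # unlabelled sample to be explained

neighbours = retrieve(query, labelled, k=3)
prediction = majority_label(neighbours)
print(prediction, [label for _, label in neighbours])
```

Returning the retrieved neighbours alongside the prediction is what makes this kind of analysis explainable: the labelled examples that drove the decision can be inspected directly.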
Pages: 6