Dual-Tree Complex Wavelet Transform for the Automatic Detection of the Common Cold Based on Speech Signals

Cited by: 0
Authors
Warule, Pankaj [1]
Chandratre, Snigdha [2]
Daware, Smita [3]
Mishra, Siba Prasad [2]
Deb, Suman [2]
Affiliations
[1] Pravara Rural Engn Coll, Loni, Maharashtra, India
[2] Sardar Vallabhbhai Natl Inst Technol, Surat, Gujarat, India
[3] Shri Ramdeobaba Coll Engn & Management, Nagpur, Maharashtra, India
Keywords
Cold speech; Deep neural network; Dual-tree complex wavelet transform; Transformer; Entropy
DOI
10.1007/s00034-025-03041-9
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
The acoustic and prosodic features of speech change with a speaker's health state, which makes voice a promising modality for non-invasive diagnostic technologies in biomedical engineering. The common cold is a highly prevalent illness that affects a significant proportion of the global population throughout the year, and detecting it from speech signals has attracted growing interest in recent years. This study proposes a new feature extraction technique based on the dual-tree complex wavelet transform (DTCWT) for detecting common cold infection. First, the DTCWT decomposes the speech signal into multiple sets of sub-band coefficients. Then the mean, variance, skewness, kurtosis, energy, approximate entropy, Renyi entropy, and permutation entropy are extracted from these sub-band coefficients. The URTIC database is used to assess the effectiveness of the proposed features. Classification with a transformer model detects cold from a speech sample with an unweighted average recall (UAR) of 68.66% on the development set and 64.52% on the test set of the URTIC dataset, which is comparable with state-of-the-art methods. The DTCWT captures subtle changes in speech signals, making it well suited to detecting common cold symptoms. Its ability to provide both time-frequency localization and phase information enables it to discriminate between healthy and cold-affected speech patterns, leading to improved classification accuracy.
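The per-sub-band feature step and the UAR metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the DTCWT decomposition has already been done elsewhere and that each sub-band is available as a plain list of real coefficient magnitudes. The function names, the Renyi order of 2, and the permutation order of 3 are all illustrative assumptions, and approximate entropy is left out for brevity.

```python
import math
from collections import Counter

# Assumption: `x` is a list of real-valued magnitudes of the complex
# coefficients of ONE sub-band; the DTCWT itself is not performed here.

def moments(x):
    """Return mean, variance, skewness, and kurtosis (population definitions)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(var)
    if std == 0:
        return mean, var, 0.0, 0.0
    skew = sum(((v - mean) / std) ** 3 for v in x) / n
    kurt = sum(((v - mean) / std) ** 4 for v in x) / n
    return mean, var, skew, kurt

def energy(x):
    """Sum of squared coefficients."""
    return sum(v * v for v in x)

def renyi_entropy(x, alpha=2.0):
    """Renyi entropy (nats) of the normalized energy distribution.
    The order alpha=2 is an assumption, not taken from the paper."""
    total = energy(x)
    if total == 0:
        return 0.0
    p = [v * v / total for v in x if v != 0]
    return math.log(sum(q ** alpha for q in p)) / (1.0 - alpha)

def permutation_entropy(x, order=3):
    """Shannon entropy (nats) of ordinal patterns of length `order`."""
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: x[i + k]))
        for i in range(len(x) - order + 1)
    )
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())

def subband_features(x):
    """Feature vector for one sub-band (approximate entropy omitted)."""
    mean, var, skew, kurt = moments(x)
    return [mean, var, skew, kurt, energy(x),
            renyi_entropy(x), permutation_entropy(x)]

def uar(y_true, y_pred):
    """Unweighted average recall: the mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)
```

Concatenating `subband_features` over all sub-bands would give one feature vector per utterance for a classifier. UAR weights each class equally, which matters when healthy samples outnumber cold samples.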
Pages: 20