Detection of breath sounds in speech: A deep learning approach

Cited by: 0
Authors
Arafath, K. Mohamed Ismail Yasar [1]
Routray, Aurobinda [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Kharagpur 721302, India
Keywords
Breath sound detection; Breath annotation; Deep learning; Mel-spectrogram; Self-supervised learning; SIGNAL
DOI
10.1016/j.engappai.2024.109808
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Breath sound detection from speech recordings has wide-ranging applications, from high-quality audio recordings to medical diagnostics. However, perceptual recognition of breath sounds for annotation is prone to errors, and breath sounds typically occupy only 5% of speech recordings, leading to significant class imbalance. Additionally, the limited availability of annotated data makes the application of deep learning (DL) methods challenging. This paper proposes the use of thermal and normal videos, alongside speech data, to mitigate annotation errors in breath sound detection. To address class imbalance, we leverage self-supervised learning (SSL), employing a jigsaw puzzle solver as a pretext task to augment training data and enhance model performance. Because the jigsaw puzzle solver creates a balanced task for pretraining, it helps offset the class imbalance and improves the performance of the downstream task. This work also uses convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) models to locate breath sounds in speech recordings accurately. The proposed SSL implementation achieves an F1-score of 96% in a speaker-independent configuration. The proposed algorithm has also been tested on publicly available audio recordings from YouTube, and the BiLSTM version is available for testing on Hugging Face.
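The abstract does not give the details of the jigsaw pretext task, so the following is only a minimal sketch of the general idea under stated assumptions: a mel-spectrogram patch is cut into three equal segments along the time axis, the segments are reordered by a permutation drawn uniformly from a fixed set, and the permutation index becomes the label. The tile count, the patch size, the random stand-in data, and the names `shuffle_tiles` and `make_jigsaw_example` are all hypothetical and are not taken from the paper.

```python
import itertools
import numpy as np

def shuffle_tiles(spec, perm):
    """Cut a (n_mels, n_frames) array into len(perm) equal time segments
    and concatenate them in the order given by perm."""
    n_tiles = len(perm)
    n_frames = spec.shape[1] - spec.shape[1] % n_tiles  # drop leftover frames
    tiles = np.split(spec[:, :n_frames], n_tiles, axis=1)
    return np.concatenate([tiles[i] for i in perm], axis=1)

def make_jigsaw_example(spec, perms, rng):
    """Return (shuffled patch, permutation index). Drawing the index
    uniformly is what makes the pretext labels balanced."""
    label = int(rng.integers(len(perms)))
    return shuffle_tiles(spec, perms[label]), label

rng = np.random.default_rng(0)
perms = list(itertools.permutations(range(3)))  # 6 pretext classes (assumed)
spec = rng.random((40, 90))  # random stand-in for a mel-spectrogram patch

examples = [make_jigsaw_example(spec, perms, rng) for _ in range(6000)]
counts = np.bincount([label for _, label in examples], minlength=len(perms))
print(counts)  # roughly equal counts per permutation class
```

In this sketch the pretext labels come out roughly uniform regardless of how rare breath frames are in the underlying audio, which is the sense in which a jigsaw task can be balanced even when the downstream breath/non-breath labels are not.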
Pages: 12