Using SincNet for Learning Pathological Voice Disorders

Cited by: 9
Authors
Hung, Chao-Hsiang [1]
Wang, Syu-Siang [1]
Wang, Chi-Te [2]
Fang, Shih-Hau [1]
Affiliations
[1] Yuan Ze Univ, Dept Elect Engn, Taoyuan 320, Taiwan
[2] Far Eastern Mem Hosp, Dept Otolaryngol Head & Neck Surg, New Taipei 220, Taiwan
Keywords
pathological voice; classification; sinc functions; convolutional neural network; SincNet
DOI
10.3390/s22176634
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, a major disadvantage of these models is their lack of interpretability: they do not explain the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification and detection systems, especially during the current pandemic period. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The sinc filters, which act as the front-end signal processor in SincNet, are critical for constructing a meaningful layer and directly extract the acoustic features that the subsequent networks use to generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach achieves improvements of up to 7% in accuracy and 9% in sensitivity over conventional methods, demonstrating the superior performance of the SincNet system in predicting input pathological waveforms. More importantly, based on the evaluated results, we offer possible explanations of the relationship between the system output and the speech features extracted by the first layer.
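To make the idea of a sinc-based first layer concrete, the following is a minimal sketch of a single band-pass kernel of the kind commonly used in SincNet-style front ends: the difference of two low-pass sinc filters, smoothed with a Hamming window. It is not the authors' implementation. The function names, the cutoff frequencies, the 16 kHz sample rate, and the kernel length are illustrative assumptions, and the cutoffs are fixed here, whereas in a SincNet layer they would be the learnable parameters of each filter.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at 0 set to 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def sinc_bandpass(f_low_hz, f_high_hz, sample_rate_hz, kernel_size=101):
    """Build a Hamming-windowed band-pass FIR kernel from two cutoff frequencies.

    In a SincNet-style layer the two cutoffs would be trained; here they are
    fixed inputs (an assumption made to keep the sketch self-contained).
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the kernel is symmetric")
    f1 = f_low_hz / sample_rate_hz   # normalized low cutoff (cycles/sample)
    f2 = f_high_hz / sample_rate_hz  # normalized high cutoff (cycles/sample)
    half = kernel_size // 2
    kernel = []
    for i in range(kernel_size):
        n = i - half  # time index centered on zero
        # Ideal band-pass = low-pass at f2 minus low-pass at f1.
        ideal = 2 * f2 * sinc(2 * math.pi * f2 * n) - 2 * f1 * sinc(2 * math.pi * f1 * n)
        # Hamming window reduces ripple caused by truncating the ideal filter.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append(ideal * window)
    return kernel


def gain_at(kernel, freq_hz, sample_rate_hz):
    """Magnitude of the kernel's frequency response at a single frequency."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(h * math.cos(w * k) for k, h in enumerate(kernel))
    im = -sum(h * math.sin(w * k) for k, h in enumerate(kernel))
    return math.hypot(re, im)


# Hypothetical speech band: 300 Hz to 3000 Hz at a 16 kHz sample rate.
kernel = sinc_bandpass(300.0, 3000.0, 16000.0, kernel_size=101)
passband_gain = gain_at(kernel, 1500.0, 16000.0)
stopband_gain = gain_at(kernel, 6000.0, 16000.0)
```

Because each filter is fully described by two cutoff frequencies, the band it passes can be read directly from its parameters, which is the property the abstract relies on when it describes the first layer as explainable.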
Pages: 18
Related papers
50 in total
  • [41] Pathological Voice Classification Using Local Discriminant Basis and Genetic Algorithm
    Hosseini, Pegah T.
    Almasganj, Farshad
    Darabad, Mansour R.
    2008 MEDITERRANEAN CONFERENCE ON CONTROL AUTOMATION, VOLS 1-4, 2008, : 1755 - +
  • [42] Complexity Analysis Using Nonuniform Embedding Techniques for Voice Pathological Discrimination
    Gomez Garcia, J. A.
    Godino Llorente, J. I.
    Castellanos-Dominguez, G.
    ADVANCES IN NONLINEAR SPEECH PROCESSING, 2011, 7015 : 262 - +
  • [43] Children's Voice and Voice Disorders
    McAllister, Anita
    Sjolander, Peta
    SEMINARS IN SPEECH AND LANGUAGE, 2013, 34 (02) : 71 - 79
  • [44] The Use of Deep Learning Software in the Detection of Voice Disorders: A Systematic Review
    Barlow, Joshua
    Sragi, Zara
    Rivera-Rivera, Gabriel
    Al-Awady, Abdurrahman
    Dasdogen, Umit
    Courey, Mark S.
    Kirke, Diana N.
    OTOLARYNGOLOGY-HEAD AND NECK SURGERY, 2024, 170 (06) : 1531 - 1543
  • [45] Voice Gender Recognition Using Deep Learning
    Buyukyilmaz, Mucahit
    Cibikdiken, Ali Osman
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON MODELING, SIMULATION AND OPTIMIZATION TECHNOLOGIES AND APPLICATIONS (MSOTA2016), 2016, 58 : 409 - 411
  • [46] Gender Recognition by Voice Using Machine Learning
    Bhatia, Rohit
    Singh, Nagendra Pratap
    ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2021, 2022, 1534 : 307 - 318
  • [47] Speaker Counting Model based on Transfer Learning from SincNet Bottleneck Layer
    Wang, Wei
    Seraj, Fatjon
    Meratnia, Nirvana
    Havinga, Paul J. M.
    2020 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS (PERCOM 2020), 2020,
  • [48] INTEGRATED VOICE ANALYZER FOR ACOUSTIC EVALUATION OF PATHOLOGICAL VOICE.
    Kikuchi, Yoshinobu
    Uchida, Satoshi
    Kasuya, Hideki
    1600, (E69):
  • [49] Discriminating Pathological Voice From Healthy Voice Using Cepstral Peak Prominence Smoothed Distribution in Sustained Vowel
    Castellana, Antonella
    Carullo, Alessio
    Corbellini, Simone
    Astolfi, Arianna
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2018, 67 (03) : 646 - 654
  • [50] PEDIATRIC VOICE DISORDERS
    MADDERN, BR
    CAMPBELL, TF
    STOOL, S
    OTOLARYNGOLOGIC CLINICS OF NORTH AMERICA, 1991, 24 (05) : 1125 - 1140