Using SincNet for Learning Pathological Voice Disorders

Cited by: 9
Authors
Hung, Chao-Hsiang [1]
Wang, Syu-Siang [1]
Wang, Chi-Te [2]
Fang, Shih-Hau [1]
Affiliations
[1] Yuan Ze Univ, Dept Elect Engn, Taoyuan 320, Taiwan
[2] Far Eastern Mem Hosp, Dept Otolaryngol Head & Neck Surg, New Taipei 220, Taiwan
Keywords
pathological voice; classification; sinc functions; convolutional neural network; SincNet
DOI
10.3390/s22176634
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, the major disadvantage of these advanced models is their lack of interpretability: they do not explain the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification and detection systems, especially during the pandemic period. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The sinc filters, which act as the front-end signal processor in SincNet, are critical for constructing this meaningful layer; they directly extract the acoustic features from which the subsequent networks generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach improves accuracy by up to 7% and sensitivity by up to 9% over conventional methods, demonstrating the superior performance of the SincNet system in classifying input pathological waveforms. More importantly, based on our evaluation results, we offer possible explanations of the relationship between the system output and the speech features extracted by the first layer.
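
The sinc-filter front end mentioned in the abstract can be illustrated with a minimal NumPy sketch. It builds each band-pass filter as the difference of two sinc low-pass filters, which is the standard SincNet parameterization, and applies a small bank of them to a raw waveform. The function names `sinc_bandpass` and `apply_filterbank`, the 16 kHz sample rate, the 251-tap kernel, the Hamming window, and the example cutoffs are all assumptions made for illustration and are not taken from the paper. In an actual SincNet the two cutoff frequencies of each filter are learnable parameters, whereas here they are fixed.

```python
import numpy as np

def sinc_bandpass(f_low_hz, f_high_hz, sample_rate=16000, kernel_size=251):
    """Windowed band-pass FIR filter defined only by its two cutoffs.

    Hypothetical helper: in SincNet the two cutoffs would be trained,
    here they are plain numbers.
    """
    f1 = f_low_hz / sample_rate    # normalized low cutoff (cycles/sample)
    f2 = f_high_hz / sample_rate   # normalized high cutoff (cycles/sample)
    # Time axis centred on zero so the filter is symmetric (linear phase).
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # Difference of two ideal low-pass filters; np.sinc(x) = sin(pi x)/(pi x).
    band = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # A window smooths the truncation of the ideal (infinite) response.
    return band * np.hamming(kernel_size)

def apply_filterbank(waveform, cutoffs_hz, sample_rate=16000, kernel_size=251):
    """Convolve a raw waveform with one sinc filter per (low, high) pair."""
    filters = [sinc_bandpass(lo, hi, sample_rate, kernel_size)
               for lo, hi in cutoffs_hz]
    # One output channel per filter, same length as the input.
    return np.stack([np.convolve(waveform, h, mode="same") for h in filters])

# Usage: a 1 kHz tone should pass the 300-3000 Hz band, not the 4-7 kHz band.
t = np.arange(16000) / 16000
tone = np.sin(2 * np.pi * 1000 * t)
features = apply_filterbank(tone, [(300, 3000), (4000, 7000)])
band_energy = (features ** 2).sum(axis=1)
```

Because each filter is fully described by its two cutoff frequencies, the first layer can be read directly as a set of frequency bands, which is the kind of interpretability the abstract refers to.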
Pages: 18