MMHFNet: Multi-modal and multi-layer hybrid fusion network for voice pathology detection

Cited by: 14
Authors
Mohammed, Hussein M. A. [1]
Omeroglu, Asli Nur [1]
Oral, Emin Argun [1, 2]
Affiliations
[1] Ataturk Univ, Dept Elect & Elect Engn, TR-25240 Yakutiye, Erzurum, Turkiye
[2] Ataturk Univ, High Performance Comp Applicat & Res Ctr, TR-25240 Yakutiye, Erzurum, Turkiye
Keywords
Voice pathology detection; Multi-modal data fusion; Multi-layer fusion; Deep learning; CNN; LSTM; NEURAL-NETWORKS; HEALTH-CARE; INFORMATION; CLASSIFICATION; IDENTIFICATION
DOI
10.1016/j.eswa.2023.119790
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic voice pathology detection using non-invasive techniques that utilize patients' speech and electroglottograph (EGG) signals plays a vital role in diagnosis and early medical intervention. In this paper, a novel deep Multi-Modal and Multi-Layer Hybrid Fusion Network (MMHFNet) is proposed to improve the performance of non-invasive voice pathology detection systems. MMHFNet simultaneously incorporates complementary information from different modalities (speech and EGG signals). It also vertically combines low-level features, extracted from shallow layers, with high-level features, extracted from deep layers, to take full advantage of the spatio-spectral information of different layers for multi-layer fusion. The features extracted by MMHFNet are then fed into an LSTM classification network to diagnose the voice pathology. Comprehensive experiments are conducted on the publicly available Saarbruecken Voice Database (SVD) to evaluate the performance of the proposed MMHFNet. This dataset is used in two ways: one using all of its samples and the other using selected samples that form the largest balanced SVD dataset. Experimental results demonstrate that the proposed MMHFNet achieves accuracy rates of 91% and 96.05% for the datasets with all and balanced samples, respectively.
Pages: 13