Wave2Graph: Integrating spectral features and correlations for graph-based learning in sound waves

Citations: 0
Authors
Hoang, Van-Truong [1]
Tran, Khanh-Tung [2]
Vu, Xuan-Son [3]
Nguyen, Duy-Khuong [1]
Bhuyan, Monowar [3]
Nguyen, Hoang D. [2]
Affiliations
[1] FPT Software Co Ltd, AI Ctr, Hanoi, Vietnam
[2] Univ Coll Cork, Sch Comp Sci & Informat Technol, Cork, Ireland
[3] Umea Univ, Dept Comp Sci, Umea, Sweden
Source
AI OPEN | 2024 / Volume 5
Funding
Science Foundation Ireland
Keywords
Wave2Graph; Graph neural network; Correlation; Sound signal processing; Neural network architecture;
DOI
10.1016/j.aiopen.2024.08.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates a novel graph-based representation of sound waves inspired by the physical phenomenon of correlated vibrations. We propose Wave2Graph, a framework that integrates multiple acoustic representations, including the frequency spectrum and correlations, into various neural computing architectures to achieve new state-of-the-art performance in sound classification. The capability and reliability of our end-to-end framework are demonstrated in voice pathology for low-cost, non-invasive mass screening of medical conditions, including respiratory illnesses and Alzheimer's dementia. We conduct extensive experiments on multiple public benchmark datasets (ICBHI and ADReSSo) and on our real-world dataset (IJSound: respiratory disease detection using coughs and breaths). The Wave2Graph framework consistently outperforms previous state-of-the-art methods by a large margin, with up to 7.65% improvement, indicating the promise of graph-based representations in signal processing and machine learning.
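The abstract does not say how the graph is constructed. As a rough illustration of the general idea only, and not the authors' method, the sketch below builds a graph whose nodes are frequency bins of a magnitude spectrogram and whose edges link bins with strongly correlated magnitudes over time. The Pearson correlation, the threshold value, the synthetic test tone, and all names (`spectrogram`, `correlation_graph`, `frame_len`, `hop`, `threshold`) are assumptions made for this example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so that rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def correlation_graph(spec, threshold=0.5):
    """Adjacency matrix linking frequency bins whose magnitudes co-vary in time.

    Assumption: Pearson correlation with a hard threshold; the paper may use
    a different correlation measure or edge rule.
    """
    corr = np.nan_to_num(np.corrcoef(spec))  # constant rows would yield NaN
    corr = (corr + corr.T) / 2.0             # enforce exact symmetry
    adj = (np.abs(corr) >= threshold).astype(np.float32)
    np.fill_diagonal(adj, 0.0)               # no self-loops
    return adj

# Synthetic amplitude-modulated tone with a little noise (illustrative input).
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = (np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
          + 0.01 * rng.standard_normal(t.size))

spec = spectrogram(signal)       # node features: one row per frequency bin
adj = correlation_graph(spec)    # graph structure over frequency bins
print(spec.shape, adj.shape, int(adj.sum()) // 2, "edges")
```

In a pipeline of this kind, the rows of `spec` could serve as node features and `adj` as the edge structure passed to a graph neural network; how Wave2Graph actually combines them is described in the paper itself.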
Pages: 115-125
Number of pages: 11