Wave2Graph: Integrating spectral features and correlations for graph-based learning in sound waves

Cited by: 0
Authors
Hoang, Van-Truong [1]
Tran, Khanh-Tung [2]
Vu, Xuan-Son [3]
Nguyen, Duy-Khuong [1]
Bhuyan, Monowar [3]
Nguyen, Hoang D. [2]
Affiliations
[1] FPT Software Co Ltd, AI Ctr, Hanoi, Vietnam
[2] Univ Coll Cork, Sch Comp Sci & Informat Technol, Cork, Ireland
[3] Umea Univ, Dept Comp Sci, Umea, Sweden
Source
AI OPEN | 2024 / Vol. 5
Funding
Science Foundation Ireland
Keywords
Wave2Graph; Graph neural network; Correlation; Sound signal processing; Neural network architecture
DOI
10.1016/j.aiopen.2024.08.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates a novel graph-based representation of sound waves inspired by the physical phenomenon of correlated vibrations. We propose the Wave2Graph framework, which integrates multiple acoustic representations, including the frequency spectrum and correlations, into various neural computing architectures to achieve new state-of-the-art performance in sound classification. We demonstrate the capability and reliability of our end-to-end framework in voice pathology, targeting low-cost, non-invasive mass screening of medical conditions, including respiratory illnesses and Alzheimer's dementia. We conduct extensive experiments on multiple public benchmark datasets (ICBHI and ADReSSo) and on our real-world dataset (IJSound: respiratory disease detection using coughs and breaths). The Wave2Graph framework consistently outperforms previous state-of-the-art methods by a large margin, with up to 7.65% improvement, indicating the promise of graph-based representations in signal processing and machine learning.
Pages: 115-125
Number of pages: 11