Wave2Graph: Integrating spectral features and correlations for graph-based learning in sound waves

Cited by: 0

Authors:

Hoang, Van-Truong [1]

Tran, Khanh-Tung [2]

Vu, Xuan-Son [3]

Nguyen, Duy-Khuong [1]

Bhuyan, Monowar [3]

Nguyen, Hoang D. [2]

Affiliations:

[1] FPT Software Co Ltd, AI Ctr, Hanoi, Vietnam

[2] Univ Coll Cork, Sch Comp Sci & Informat Technol, Cork, Ireland

[3] Umea Univ, Dept Comp Sci, Umea, Sweden

Source:

AI Open | 2024 / Vol. 5

Funding:

Science Foundation Ireland

Keywords:

Wave2Graph; Graph neural network; Correlation; Sound signal processing; Neural network architecture;

DOI:

10.1016/j.aiopen.2024.08.004

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper investigates a novel graph-based representation of sound waves inspired by the physical phenomenon of correlated vibrations. We propose Wave2Graph, a framework that integrates multiple acoustic representations, including the frequency spectrum and correlations, into various neural computing architectures to achieve new state-of-the-art performance in sound classification. The capability and reliability of our end-to-end framework are demonstrated in voice pathology for low-cost, non-invasive mass screening of medical conditions, including respiratory illnesses and Alzheimer's dementia. We conduct extensive experiments on multiple public benchmark datasets (ICBHI and ADReSSo) and on our real-world dataset (IJSound: respiratory disease detection using coughs and breaths). The Wave2Graph framework consistently outperforms previous state-of-the-art methods by a large margin, with up to 7.65% improvement, indicating the promise of graph-based representations in signal processing and machine learning.
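
The abstract does not spell out how the graph is built, so the following is only an illustrative sketch of one plausible reading of "spectrum of frequencies and correlations", not the authors' implementation. It assumes that each frequency bin of a magnitude spectrogram is a node, that the node feature is the bin's energy over time, and that two nodes are connected when the Pearson correlation of their time profiles is strong. The function names `spectrogram` and `correlation_graph`, the frame sizes, and the threshold are all hypothetical choices made for this example.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time FFT.

    Returns an array of shape (n_freq_bins, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def correlation_graph(spec, threshold=0.5):
    """Build a graph whose nodes are frequency bins (assumed construction).

    Node features are the rows of `spec`; an edge links two bins when the
    absolute Pearson correlation of their time profiles reaches `threshold`.
    """
    corr = np.nan_to_num(np.corrcoef(spec))  # constant rows would give NaN
    corr = (corr + corr.T) / 2.0             # enforce exact symmetry
    adj = (np.abs(corr) >= threshold).astype(np.float32)
    np.fill_diagonal(adj, 0.0)               # no self-loops
    return spec, adj


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(4096) / 8000.0
    wave = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(4096)
    features, adjacency = correlation_graph(spectrogram(wave))
    print(features.shape, adjacency.shape)
```

The resulting feature matrix and adjacency matrix are the two inputs a typical graph neural network layer expects. How the published framework chooses nodes, edge weights, and thresholds may differ from this sketch.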

Pages: 115-125

Page count: 11
