Sound recurrence analysis for acoustic scene classification

Cited by: 0
Authors
Abesser, Jakob [1]
Liang, Zhiwei [1, 2]
Seeber, Bernhard [2]
Affiliations
[1] Fraunhofer IDMT, Semant Mus Technol, Ehrenbergstr 31, D-98693 Ilmenau, Germany
[2] Tech Univ Munich, Audio Informat Proc, Theresienstr 90, D-80333 Munich, Germany
Keywords
Acoustic scene classification; Sound recurrence analysis; Sound repetition patterns; Self-similarity matrix; Harmonic-percussive source separation; Result fusion; Ensemble models
DOI
10.1186/s13636-024-00390-2
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In everyday life, people experience different soundscapes in which natural sounds, animal noises, and man-made sounds blend together. Although there have been several studies on the importance of recurring sound patterns in music and language, the relevance of this phenomenon in natural soundscapes is still largely unexplored. In this article, we study the repetition patterns of harmonic and transient sound events as potential cues for acoustic scene classification (ASC). In the first part of our study, our aim is to identify acoustic scene classes that exhibit characteristic sound repetition patterns concerning harmonic and transient sounds. We propose three metrics to measure the overall prevalence of sound repetitions as well as their repetition periods and temporal stability. In the second part, we evaluate three strategies to incorporate self-similarity matrices as an additional input feature to a convolutional neural network architecture for ASC. We observe the characteristic repetition of transient sounds in recordings of "park" and "street traffic" as well as harmonic sound repetitions in acoustic scene classes related to public transportation. In the ASC experiments, hybrid network architectures, which combine spectrogram features and features from sound recurrence analysis, show increased accuracy for those classes with prominent sound repetition patterns. Our findings provide additional perspective on the distinctions among acoustic scenes previously primarily ascribed in the literature to their spectral features.
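As a rough illustration of the self-similarity analysis the abstract refers to, the following minimal sketch builds a cosine self-similarity matrix from a sequence of feature frames and reads off a repetition period from the mean similarity along each diagonal. It is not the authors' method: the three metrics proposed in the article are not specified in this record, and the function names `self_similarity_matrix`, `lag_profile`, and `dominant_period`, as well as the toy one-hot "frames", are hypothetical and chosen only for this example. In practice the frames would be spectrogram columns, possibly computed separately for the harmonic and percussive components.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat silent (all-zero) frames as dissimilar to everything
    return dot / (norm_a * norm_b)


def self_similarity_matrix(frames):
    """Pairwise cosine similarity between all frames (n x n nested list)."""
    n = len(frames)
    return [[cosine_similarity(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


def lag_profile(ssm):
    """Mean similarity along each off-diagonal; entry k belongs to lag k + 1."""
    n = len(ssm)
    profile = []
    for lag in range(1, n):
        values = [ssm[i][i + lag] for i in range(n - lag)]
        profile.append(sum(values) / len(values))
    return profile


def dominant_period(ssm, max_lag=None):
    """Smallest lag (in frames) with the highest mean diagonal similarity.

    Hypothetical illustration only, not one of the article's metrics.
    """
    profile = lag_profile(ssm)
    if max_lag is not None:
        profile = profile[:max_lag]
    best = max(range(len(profile)), key=lambda k: profile[k])
    return best + 1


# Toy input: a three-frame pattern repeated four times (assumed data).
pattern = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
frames = pattern * 4
ssm = self_similarity_matrix(frames)
period = dominant_period(ssm, max_lag=6)
print(period)
```

A strongly periodic input such as this one produces bright diagonals in the matrix at multiples of the repetition period, and it is this kind of structure that a classifier could use as an additional input alongside the spectrogram.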
Pages: 15