Feature extraction strategies in deep learning based acoustic event detection

Cited by: 0

Authors:

Espi, Miguel [1]

Fujimoto, Masakiyo [1]

Kinoshita, Keisuke [1]

Nakatani, Tomohiro [1]

Affiliations:

[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

acoustic event detection; spectro-temporal locality; multi-resolution; convolutional neural networks

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Non-speech acoustic events differ significantly from one another, and usually require access to detail-rich features. That is why directly modeling a real spectrogram can provide a significant advantage over predefined features, which usually compress and downsample detail, as is typically done in speech recognition. This paper focuses on the importance of feature extraction for deep learning based acoustic event detection, and more specifically on exploiting local spectro-temporal features of sounds. We do this in two ways: (1) outside the model, by using spectrograms at multiple resolutions simultaneously, based on the fact that there is a time-frequency detail trade-off that depends on the resolution with which a spectrogram is computed (e.g. 'steps' would require a finer time resolution, while sounds that span many frequencies require finer frequency detail); and (2) with a model that implicitly exploits locality, namely convolutional neural networks (CNNs), which are a state-of-the-art 2D feature extraction model. An experimental evaluation shows that the presented approaches outperform a state-of-the-art deep learning baseline, with a noticeable gain in the CNN case, and provides insights regarding CNN-based spectrogram characterization.
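
The time-frequency trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`spectrogram`, `multi_resolution`), the window lengths (256 and 1024 samples), the hop size, the sampling rate, and the synthetic test signal are all assumptions chosen for illustration. A short window gives more frames per second but coarser frequency bins, and a long window gives the reverse.

```python
import numpy as np

def spectrogram(signal, win_len, hop):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    # One row per frame, one column per frequency bin (win_len // 2 + 1 bins).
    return np.abs(np.fft.rfft(frames, axis=1))

def multi_resolution(signal, win_lens=(256, 1024), hop=128):
    """Compute one spectrogram per window length (hypothetical helper).

    Using a shared hop keeps the frame rate of the views comparable.
    """
    return {w: spectrogram(signal, w, hop) for w in win_lens}

# Synthetic input (assumption): 1 s of a 440 Hz tone at 16 kHz plus a click.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)
signal[8000] += 5.0  # impulsive event, best localized by the short window

specs = multi_resolution(signal)
for win_len, spec in specs.items():
    print(f"window={win_len}: frames={spec.shape[0]}, bins={spec.shape[1]}, "
          f"bin width={sr / win_len:.1f} Hz")
```

In this sketch the 256-sample view resolves frequency in 62.5 Hz steps, while the 1024-sample view resolves it in steps of about 15.6 Hz but smears the click over more frames. How such views would be combined and fed to a network is not specified by the abstract and is left out here.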

Pages: 2922-2926

Page count: 5
