Feature extraction strategies in deep learning based acoustic event detection

Authors
Espi, Miguel [1]
Fujimoto, Masakiyo [1]
Kinoshita, Keisuke [1]
Nakatani, Tomohiro [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
acoustic event detection; spectro-temporal locality; multi-resolution; convolutional neural networks
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Non-speech acoustic events differ significantly from one another and usually require detail-rich features. For this reason, directly modeling a real spectrogram can provide a significant advantage over predefined features, which usually compress and downsample detail, as is typical in speech recognition. This paper focuses on the importance of feature extraction for deep learning based acoustic event detection, and more specifically on exploiting local spectro-temporal features of sounds. We do this in two ways: (1) outside the model, by using spectrograms at multiple resolutions simultaneously, based on the fact that there is a time-frequency detail trade-off that depends on the resolution with which a spectrogram is computed (e.g., 'steps' would require a finer time resolution, while sounds that span many frequencies require finer frequency detail); and (2) with a model that implicitly exploits locality, namely convolutional neural networks, which are a state-of-the-art 2D feature extraction model. An experimental evaluation shows that the presented approaches outperform a state-of-the-art deep learning baseline, with a noticeable gain in the CNN case, and provides insights regarding CNN-based spectrogram characterization.
Pages: 2922-2926
Number of pages: 5
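
Illustrative sketch (not part of the original record): the time-frequency trade-off mentioned in the abstract can be made concrete with a few lines of NumPy. A short analysis window gives fine time resolution but coarse frequency bins, and a long window gives the reverse. The window lengths, hop size, sample rate, and the choice to concatenate the two spectrograms along the feature axis are all assumptions made for illustration; the abstract does not specify how the authors combine resolutions, so this should not be read as their method.

```python
import numpy as np

def stft_magnitude(signal, win_len, hop):
    """Magnitude spectrogram with a Hann window, shape (frames, win_len // 2 + 1)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def multi_resolution_spectrograms(signal, win_lens=(256, 1024), hop=128):
    """Concatenate spectrograms computed with several window lengths.

    Hypothetical helper: the same hop is used for every resolution so that
    frame i starts at the same sample in each spectrogram, and all outputs
    are truncated to the shortest frame count before concatenation.
    """
    specs = [stft_magnitude(signal, w, hop) for w in win_lens]
    n = min(s.shape[0] for s in specs)
    return np.concatenate([s[:n] for s in specs], axis=1)

# Toy input (assumed 16 kHz): a steady 440 Hz tone plus one short click.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)
signal[8000] += 5.0  # impulsive event, best localized by the short window

features = multi_resolution_spectrograms(signal)
print(features.shape)  # (118, 642): 129 bins (win 256) + 513 bins (win 1024)

# Frequency spacing per bin: coarse for the short window, fine for the long one.
print(sr / 256, sr / 1024)  # 62.5 15.625
```

With these assumed settings, the 256-sample window resolves the click to within about 16 ms but separates frequencies only every 62.5 Hz, whereas the 1024-sample window separates frequencies every 15.625 Hz while smearing the click over 64 ms.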