Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network

Cited by: 35
Authors
Sharma, Jivitesh [1]
Granmo, Ole-Christoffer [1]
Goodwin, Morten [1]
Affiliations
[1] Univ Agder, Dept Informat & Commun Technol, Ctr Artificial Intelligence Res, Kristiansand, Norway
Source
INTERSPEECH 2020 | 2020
Keywords
Convolutional Neural Networks; Attention; Multiple Feature Channels; Environment Sound Classification
DOI
10.21437/Interspeech.2020-1303
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification code
100104; 100213
Abstract
In this paper, we propose a model for the Environment Sound Classification (ESC) task that consists of multiple feature channels given as input to a Deep Convolutional Neural Network (CNN) with an attention mechanism. The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant Q-transform (CQT) and Chromagram. We also employ a deeper CNN (DCNN) than previous models, consisting of spatially separable convolutions that operate on the time and feature domains separately. In addition, we use attention modules that perform channel and spatial attention together. We use the mix-up data augmentation technique to further boost performance. Our model achieves state-of-the-art performance on three benchmark environment sound classification datasets: UrbanSound8K (97.52%), ESC-10 (94.75%) and ESC-50 (87.45%).
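The mix-up augmentation mentioned in the abstract can be illustrated with a minimal sketch. It follows the general mix-up formulation (a convex combination of two examples and their labels, with a weight drawn from a Beta distribution) and is not taken from the paper itself. The function name `mixup_pair`, the flattened-list feature representation, and the value `alpha=0.2` are all assumptions made for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels (general mix-up recipe).

    x1, x2: feature vectors of equal length (assumed flattened to lists).
    y1, y2: one-hot label vectors of equal length.
    alpha:  Beta distribution parameter; 0.2 is an illustrative assumption,
            not a value reported in the abstract.
    rng:    the random module or a random.Random instance.
    """
    # Mixing weight lam lies in [0, 1] and is drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Features and labels are mixed with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, lam = mixup_pair([1.0, 2.0], [1.0, 0.0], [3.0, 4.0], [0.0, 1.0], rng=rng)
    print(lam, x, y)
```

Because the labels are mixed with the same weight as the inputs, the resulting soft label still sums to one, so it can be used directly with a cross-entropy loss.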
Pages: 1186-1190
Page count: 5
Related papers
50 in total
  • [11] Deep Convolutional Neural Network with Mixup for Environmental Sound Classification
    Zhang, Zhichao
    Xu, Shugong
    Cao, Shan
    Zhang, Shunqing
    PATTERN RECOGNITION AND COMPUTER VISION, PT II, 2018, 11257 : 356 - 367
  • [12] Convolutional Neural Network Based on Multiple Attention Mechanisms for Hyperspectral and LiDAR Classification
    Wang, Yingying
    Wang, Kun
    Ding, Zhiming
    SPATIAL DATA AND INTELLIGENCE, SPATIALDI 2024, 2024, 14619 : 274 - 287
  • [13] Animal Sound Classification Using A Convolutional Neural Network
    Sasmaz, Emre
    Tek, F. Boray
    2018 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2018 : 625 - 629
  • [14] Cyclostationary Feature Based Modulation Classification With Convolutional Neural Network in Multipath Fading Channels
    Yin, Liyan
    Xiang, Xin
    Liang, Yuan
    IEEE ACCESS, 2023, 11 : 105455 - 105465
  • [15] Pathological brain classification using multiple kernel-based deep convolutional neural network
    Dora, Lingraj
    Agrawal, Sanjay
    Panda, Rutuparna
    Pachori, Ram Bilas
    Neural Computing and Applications, 2024, 36 (02) : 747 - 756
  • [16] Pathological brain classification using multiple kernel-based deep convolutional neural network
    Dora, Lingraj
    Agrawal, Sanjay
    Panda, Rutuparna
    Pachori, Ram Bilas
    NEURAL COMPUTING & APPLICATIONS, 2023, 36 (2) : 747 - 756
  • [17] Pathological brain classification using multiple kernel-based deep convolutional neural network
    Lingraj Dora
    Sanjay Agrawal
    Rutuparna Panda
    Ram Bilas Pachori
    Neural Computing and Applications, 2024, 36 : 747 - 756
  • [18] Environmental sound classification using a regularized deep convolutional neural network with data augmentation
    Mushtaq, Zohaib
    Su, Shun-Feng
    APPLIED ACOUSTICS, 2020, 167
  • [19] Feature Extraction and Classification of Odor Using Attention Based Neural Network
    Fukuyama, Kohei
    Matsui, Kenji
    Omatsu, Sigeru
    Rivas, Alberto
    Manuel Corchado, Juan
    DISTRIBUTED COMPUTING AND ARTIFICIAL INTELLIGENCE, 16TH INTERNATIONAL CONFERENCE, 2020, 1003 : 142 - 149
  • [20] Discriminative Feature Learning for Skin Disease Classification Using Deep Convolutional Neural Network
    Ahmad, Belal
    Usama, Mohd
    Huang, Chuen-Min
    Hwang, Kai
    Hossain, M. Shamim
    Muhammad, Ghulam
    IEEE ACCESS, 2020, 8 (08) : 39025 - 39033