Environmental Sound Classification Based on CAR-Transformer Neural Network Model

Cited by: 0
Authors
Huaicheng Li
Aibin Chen
Jizheng Yi
Wenjie Chen
Daowu Yang
Guoxiong Zhou
Weixiong Peng
Affiliations
[1] Central South University of Forestry and Technology, College of Computer and Information Engineering
[2] Hunan Zixing Artificial Intelligence Technology Group Co., Ltd
Source
Circuits, Systems, and Signal Processing | 2023 / Vol. 42
Keywords
Transformer; Attention; MFCC; Environment sound classification
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Environment Sound Classification (ESC) is a challenging task in the audio field because of the variety of ambient sounds involved. In this paper, we propose a method for ESC tasks based on the CAR-Transformer neural network model, comprising three stages: sound sample pre-processing, deep learning-based feature extraction, and classification. We convert the one-dimensional audio signal into two-dimensional Mel Frequency Cepstral Coefficients (MFCC) and use them as the feature map of the audio. The CAR-Transformer model performs feature extraction; after dimensionality reduction of the extracted feature map, a fully connected layer serves as the classifier to produce the final results. The method achieves a classification accuracy of 96.91% on the UrbanSound8K dataset, while the model has only 0.16 M parameters. The results are compared with other state-of-the-art research.
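The pre-processing step described in the abstract, turning a one-dimensional audio signal into a two-dimensional MFCC feature map, can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the frame length, hop size, number of mel filters, and number of coefficients below are assumed values chosen for illustration, the function names are hypothetical, and the CAR-Transformer model itself is not shown.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (n_filters x n_bins)."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, evenly spaced in mel, mapped to FFT bins
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = [[0.0] * n_bins for _ in range(n_filters)]
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising slope
            bank[m - 1][k] = (k - left) / (center - left)
        for k in range(center, right):      # falling slope
            bank[m - 1][k] = (right - k) / (right - center)
    return bank


def power_spectrum(frame, n_fft):
    """Naive DFT power spectrum of one frame (slow, but has no dependencies)."""
    spec = []
    for k in range(n_fft // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * n / n_fft) for n, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * n / n_fft) for n, x in enumerate(frame))
        spec.append((re * re + im * im) / n_fft)
    return spec


def mfcc(signal, sample_rate, n_fft=256, hop=128, n_filters=20, n_coeffs=13):
    """Return a 2-D list (frames x coefficients). All defaults are assumptions."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_fft - 1)) for n in range(n_fft)]
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    features = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(n_fft)]
        spec = power_spectrum(frame, n_fft)
        # Log mel-band energies; the small constant avoids log(0)
        log_e = [math.log(sum(w * p for w, p in zip(filt, spec)) + 1e-10) for filt in bank]
        # Unnormalized DCT-II decorrelates the log energies into cepstral coefficients
        coeffs = [
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters) for j, e in enumerate(log_e))
            for c in range(n_coeffs)
        ]
        features.append(coeffs)
    return features


# Usage: a synthetic 440 Hz tone stands in for a real audio clip.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
feature_map = mfcc(tone, sample_rate=8000)
```

In practice an audio library would replace the naive DFT with an FFT; the sketch only shows how each frame of the signal becomes one row of the feature map that a downstream model consumes.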
Pages: 5289-5312
Page count: 23