Acoustic event recognition using cochleagram image and convolutional neural networks

Cited by: 41
Authors
Sharan, Roneel V. [1]
Moir, Tom J. [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Auckland Univ Technol, Sch Engn, Private Bag 92006, Auckland 1142, New Zealand
Keywords
Acoustic event recognition; Cochleagram; Convolutional neural network; Mel-spectrogram; Spectrogram; FEATURES; CLASSIFICATION
DOI
10.1016/j.apacoust.2018.12.006
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Convolutional neural networks (CNN) have produced encouraging results in image classification tasks and have been increasingly adopted in audio classification applications. However, in using CNN for acoustic event recognition, the first hurdle is finding the best image representation of an audio signal. In this work, we evaluate the performance of four time-frequency representations for use with CNN. Firstly, we consider the conventional spectrogram image. Secondly, we apply a moving average to the spectrogram along the frequency domain to obtain what we refer to as the smoothed spectrogram. Thirdly, we use the mel-spectrogram, which utilizes the mel-filter, as in mel-frequency cepstral coefficients. Finally, we propose the use of a cochleagram image, the frequency components of which are based on the frequency selectivity property of the human cochlea. We test the proposed techniques on an acoustic event database containing 50 sound classes. The results show that the proposed cochleagram time-frequency image representation gives the best classification performance when used with CNN. (C) 2018 Elsevier Ltd. All rights reserved.
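The second representation in the abstract, a spectrogram smoothed by a moving average along the frequency axis, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the frame length, hop size, Hann window, log scaling, the smoothing width of 5 bins, and the function names `spectrogram` and `smooth_along_frequency`.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram with shape (frequency bins, frames).

    Frame length, hop and the Hann window are illustrative choices.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log(magnitude + 1e-10)  # small offset avoids log(0)

def smooth_along_frequency(spec, width=5):
    """Moving average over `width` neighbouring frequency bins, per frame.

    mode="same" zero-pads at the band edges, so the first and last
    few bins are averaged against zeros (an assumed edge treatment).
    """
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="same"), 0, spec
    )

# Synthetic one-second test tone with a little noise (hypothetical input).
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)

spec = spectrogram(signal)
smoothed = smooth_along_frequency(spec)
```

Either array could then be rescaled and saved as an image for a CNN input; whether the smoothing is applied before or after log scaling is a design choice this sketch does not settle.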
Pages: 62-66
Page count: 5
Related papers
50 in total
  • [21] Medical Image Analysis using Convolutional Neural Networks: A Review
    Anwar, Syed Muhammad
    Majid, Muhammad
    Qayyum, Adnan
    Awais, Muhammad
    Alnowami, Majdi
    Khan, Muhammad Khurram
    JOURNAL OF MEDICAL SYSTEMS, 2018, 42 (11)
  • [22] Weighted pooling for image recognition of deep convolutional neural networks
    Xiaoning Zhu
    Qingyue Meng
    Bojian Ding
    Lize Gu
    Yixian Yang
    Cluster Computing, 2019, 22 : 9371 - 9383
  • [24] Parallelizing Convolutional Neural Networks for Action Event Recognition in Surveillance Videos
    Wang, Qicong
    Zhao, Jinhao
    Gong, Dingxi
    Shen, Yehu
    Li, Maozhen
    Lei, Yunqi
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2017, 45 (04) : 734 - 759
  • [25] Speech Emotion Recognition and Deep Learning: An Extensive Validation Using Convolutional Neural Networks
    Ri, Francesco Ardan Dal
    Ciardi, Fabio Cifariello
    Conci, Nicola
    IEEE ACCESS, 2023, 11 : 116638 - 116649
  • [26] Activity landscape image analysis using convolutional neural networks
    Iqbal, Javed
    Vogt, Martin
    Bajorath, Juergen
    JOURNAL OF CHEMINFORMATICS, 2020, 12 (01)
  • [27] FAST ACOUSTIC SCATTERING USING CONVOLUTIONAL NEURAL NETWORKS
    Fan, Ziqi
    Vineet, Vibhav
    Gamper, Hannes
    Raghuvanshi, Nikunj
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 171 - 175
  • [28] Recognition of Hand Gesture Image Using Deep Convolutional Neural Network
    Sagayam, K. Martin
    Andrushia, A. Diana
    Ghosh, Ahona
    Deperlioglu, Omer
    Elngar, Ahmed A.
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2022, 22 (03)
  • [29] Underwater single-channel acoustic signal multitarget recognition using convolutional neural networks
    Sun, Qinggang
    Wang, Kejun
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2022, 151 (03) : 2245 - 2254
  • [30] Robust acoustic event recognition using AVMD-PWVD time-frequency image
    Zhang, Yanhua
    Zhang, Ke
    Wang, Jingyu
    Su, Yu
    APPLIED ACOUSTICS, 2021, 178