Acoustic event recognition using cochleagram image and convolutional neural networks

Cited by: 41
Authors
Sharan, Roneel V. [1 ]
Moir, Tom J. [2 ]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Auckland Univ Technol, Sch Engn, Private Bag 92006, Auckland 1142, New Zealand
Keywords
Acoustic event recognition; Cochleagram; Convolutional neural network; Mel-spectrogram; Spectrogram; FEATURES; CLASSIFICATION;
DOI
10.1016/j.apacoust.2018.12.006
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Convolutional neural networks (CNNs) have produced encouraging results in image classification tasks and have been increasingly adopted in audio classification applications. However, in using a CNN for acoustic event recognition, the first hurdle is finding the best image representation of an audio signal. In this work, we evaluate the performance of four time-frequency representations for use with a CNN. Firstly, we consider the conventional spectrogram image. Secondly, we apply a moving average to the spectrogram along the frequency domain to obtain what we refer to as the smoothed spectrogram. Thirdly, we use the mel-spectrogram, which utilizes the mel-filter, as in mel-frequency cepstral coefficients. Finally, we propose the use of a cochleagram image, the frequency components of which are based on the frequency selectivity property of the human cochlea. We test the proposed techniques on an acoustic event database containing 50 sound classes. The results show that the proposed cochleagram time-frequency image representation gives the best classification performance when used with a CNN. (C) 2018 Elsevier Ltd. All rights reserved.
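The first two representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, hop size, Hamming window, smoothing width, log scaling, and the synthetic test tone are all assumptions chosen for illustration, and the function names `spectrogram` and `smooth_along_frequency` are hypothetical. The mel-spectrogram and cochleagram would replace the linear frequency axis with a mel or cochlear filterbank and are not shown here.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-power spectrogram with shape (frequency bins, frames).

    frame_len, hop, and the Hamming window are illustrative assumptions.
    """
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small offset avoids log(0); transpose so frequency is axis 0.
    return np.log(power.T + 1e-10)

def smooth_along_frequency(spec, width=5):
    """Moving average over the frequency axis (axis 0), frame by frame.

    The width of 5 bins is an assumed value; edges are zero-padded.
    """
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="same"), 0, spec)

# Synthetic 1 kHz tone, 1 second at 16 kHz (assumed test input).
fs = 16000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(signal)
smoothed = smooth_along_frequency(spec)
```

Either array can then be rescaled and treated as a single-channel image for input to a CNN.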
Pages: 62-66
Number of pages: 5