Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network

Cited by: 0
Authors
Venkatesh, Spoorthy [1]
Koolagudi, Shashidhar G. [1]
Affiliations
[1] Natl Inst Technol Karnataka, Surathkal 575025, India
Source
COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT I | 2024 / Vol. 2009
Keywords
Polyphonic Sound Event Detection (SED); Constant Q-Transform (CQT); Deep learning; Modified Recurrent Temporal Pyramid Network; CLASSIFICATION; SCENES;
DOI
10.1007/978-3-031-58181-6_47
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel approach to polyphonic Sound Event Detection (SED), built on a new deep learning architecture named the "Modified Recurrent Temporal Pyramid Neural Network (MR-TPNN)". The network takes as input spectrograms generated with the Constant Q-Transform (CQT). CQT spectrograms provided better sound event information in the audio recordings than the Short Time Fourier Transform (STFT) and Fast Fourier Transform (FFT) methods. Temporal information is essential for detecting the onset and offset of events in an audio recording. The architecture captures it by fusing temporal pyramids with bi-directional long short-term memory (LSTM) recurrent layers. Extensive experiments on three benchmark datasets show that the proposed method outperforms existing polyphonic SED systems.
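Two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. The first is why a CQT differs from an STFT: CQT bins are geometrically spaced (f_k = f_min * 2^(k/B) for B bins per octave), while STFT bins are linearly spaced. The second is what "polyphonic" onset/offset detection means at the output: each class is decoded independently from frame-wise probabilities, so events may overlap in time. All function names, the 0.5 threshold, and the example values are assumptions made for illustration; the record does not state the paper's settings.

```python
def cqt_center_frequencies(f_min, n_bins, bins_per_octave):
    """Geometrically spaced CQT center frequencies: f_k = f_min * 2**(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def stft_bin_frequencies(sample_rate, n_fft):
    """Linearly spaced STFT bin frequencies: f_k = k * sample_rate / n_fft."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def decode_events(activity, hop_seconds, threshold=0.5):
    """Turn frame-wise class probabilities into (label, onset, offset) events.

    `activity` maps each class label to a list of per-frame probabilities.
    Classes are thresholded independently, so the returned events may
    overlap in time (polyphony). The threshold is an assumed value.
    """
    events = []
    for label, probs in activity.items():
        start = None  # index of the first frame of the currently open event
        for i, p in enumerate(probs):
            active = p >= threshold
            if active and start is None:
                start = i
            elif not active and start is not None:
                events.append((label, start * hop_seconds, i * hop_seconds))
                start = None
        if start is not None:  # event still active at the end of the clip
            events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    return sorted(events, key=lambda e: (e[1], e[0]))


if __name__ == "__main__":
    # One octave above f_min is reached after `bins_per_octave` bins.
    print(cqt_center_frequencies(32.7, 25, 12)[12])
    # Hypothetical frame probabilities for two overlapping classes.
    frames = {"speech": [0.1, 0.9, 0.8, 0.2], "dog": [0.0, 0.0, 0.7, 0.9]}
    for event in decode_events(frames, hop_seconds=0.5):
        print(event)
```

In the hypothetical example, the "speech" and "dog" events overlap between 1.0 s and 1.5 s, which is the situation a monophonic detector cannot represent.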
Pages: 554-564
Page count: 11