Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network

Authors
Venkatesh, Spoorthy [1 ]
Koolagudi, Shashidhar G. [1 ]
Affiliations
[1] Natl Inst Technol Karnataka, Surathkal 575025, India
Source
COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT I | 2024 / Vol. 2009
Keywords
Polyphonic Sound Event Detection (SED); Constant Q-Transform (CQT); Deep learning; Modified Recurrent Temporal Pyramid Network; CLASSIFICATION; SCENES;
DOI
10.1007/978-3-031-58181-6_47
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel approach to polyphonic Sound Event Detection (SED) based on a new deep learning architecture, the Modified Recurrent Temporal Pyramid Neural Network (MR-TPNN). The network takes spectrograms generated by the Constant Q-Transform (CQT) as input features. CQT spectrograms provide better sound event information for an audio recording than the Short-Time Fourier Transform (STFT) and Fast Fourier Transform (FFT) methods. Temporal information is essential for detecting the onset and offset of events in an audio recording; the architecture captures it by fusing temporal pyramids with bidirectional long short-term memory (LSTM) recurrent layers. Extensive experiments on three benchmark datasets show that the proposed method outperforms existing polyphonic SED systems.
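The contrast between CQT and STFT features mentioned in the abstract can be illustrated with a minimal sketch of how the two transforms place their frequency bins. This is not the authors' feature pipeline: the function names, the minimum frequency of 32.70 Hz, the 84 bins, the 12 bins per octave, and the 16 kHz sample rate are all illustrative assumptions, since the record does not state the paper's settings. The sketch relies only on the standard definition of the CQT, in which center frequencies are spaced geometrically and every bin has the same ratio of center frequency to bandwidth (the Q factor).

```python
import math


def cqt_center_frequencies(f_min, n_bins, bins_per_octave):
    """Return geometrically spaced CQT bin center frequencies in Hz.

    Bin k is centered at f_min * 2 ** (k / bins_per_octave).
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def cqt_q_factor(bins_per_octave):
    """Return the Q factor (center frequency / bandwidth), equal for all bins."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def stft_bin_frequencies(sample_rate, n_fft):
    """Return linearly spaced STFT bin frequencies in Hz, from 0 to Nyquist."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


if __name__ == "__main__":
    # Illustrative settings only; none of these values come from the paper.
    cqt = cqt_center_frequencies(f_min=32.70, n_bins=84, bins_per_octave=12)
    stft = stft_bin_frequencies(sample_rate=16000, n_fft=1024)

    # CQT spacing grows with frequency; STFT spacing is constant.
    print("CQT spacing, lowest two bins (Hz): ", cqt[1] - cqt[0])
    print("CQT spacing, highest two bins (Hz):", cqt[-1] - cqt[-2])
    print("STFT spacing, every bin (Hz):      ", stft[1] - stft[0])
    print("CQT Q factor:", cqt_q_factor(12))
```

Under these assumptions the CQT bins are narrow at low frequencies and wide at high frequencies, whereas every STFT bin has the same width. The record itself gives no further detail on why CQT features performed better in the authors' experiments.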
Pages: 554-564
Page count: 11