Sound Event Detection via Conformer Recurrent Neural Networks

Authors
Gao, Fangqing [1]
Li, Xin [2]
Wei, Xiukun [1]
Affiliations
[1] Beijing Jiaotong Univ, State Key Lab Rail Traff Control & Safety, Beijing 100044, Peoples R China
[2] Beijing Mass Transit Railway Operat CORP LTD, Operat Branch Co 2, Beijing 100044, Peoples R China
Source
2023 35th Chinese Control and Decision Conference (CCDC), 2023
Keywords
Sound Event Detection; Conformer; Filter Augment; CRNN
DOI
10.1109/CCDC58219.2023.10327134
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sound event detection (SED) is a key task in machine listening that aims to mimic the capability of the human auditory system. Recently, convolutional recurrent neural networks (CRNNs) have attained state-of-the-art SED performance. The convolution module in a CRNN extracts local time-frequency information from audio. However, it cannot capture global information because of the limited size of the convolution kernel. To address this shortcoming, the convolution module is replaced with a conformer block, which combines the advantages of transformers and convolutional neural networks to describe both the local and global dependencies of audio sequences. Compared with CNN, RNN, and CRNN models on the TUT-SED 2017 dataset, the proposed method improves the F1-score by 9.86% and reduces the error rate (ER) by 0.1235 on the development dataset, and improves the F1-score by 9.13% and reduces the ER by 0.0836 on the evaluation dataset. Experimental results demonstrate the superiority and effectiveness of the proposed approach.
Pages: 4749-4754
Number of pages: 6