Audio Scene Classification with Deep Recurrent Neural Networks

Cited by: 21
Authors
Phan, Huy [1, 2]
Koch, Philipp [1 ]
Katzberg, Fabrice [1 ]
Maass, Marco [1 ]
Mazur, Radoslaw [1 ]
Mertins, Alfred [1 ]
Affiliations
[1] Univ Lubeck, Inst Signal Proc, Lubeck, Germany
[2] Univ Lubeck, Grad Sch Comp Med & Life Sci, Lubeck, Germany
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
audio scene classification; deep neural networks; recurrent neural networks; GRU
DOI
10.21437/Interspeech.2017-101
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we introduce an efficient approach for audio scene classification using deep recurrent neural networks. An audio scene is first transformed into a sequence of high-level label tree embedding feature vectors. The vector sequence is then divided into multiple subsequences, on which a deep GRU-based recurrent neural network is trained for sequence-to-label classification. The global predicted label for the entire sequence is finally obtained by aggregating the subsequence classification outputs. We show that our approach obtains an F1-score of 97.7% on the LITIS Rouen dataset, which is the largest dataset publicly available for the task. Compared to the best previously reported result on the dataset, our approach reduces the relative classification error by 35.3%.
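The split-then-aggregate step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the subsequence length, the overlap, or the aggregation rule, so the window parameters `length` and `hop`, the averaging of log-probabilities, and the function names are all assumptions. The `classifier` argument is a hypothetical stand-in for the trained GRU network and is expected to return one probability per class.

```python
import math


def split_into_subsequences(sequence, length, hop):
    """Cut a feature-vector sequence into windows of `length` frames,
    taking a new window every `hop` frames (assumed scheme)."""
    if length <= 0 or hop <= 0:
        raise ValueError("length and hop must be positive")
    if len(sequence) <= length:
        return [list(sequence)]
    last_start = len(sequence) - length
    return [list(sequence[s:s + length]) for s in range(0, last_start + 1, hop)]


def aggregate(subsequence_probs):
    """Average per-class log-probabilities over all subsequences and
    return the index of the best-scoring class (assumed rule)."""
    eps = 1e-12  # guards against log(0)
    n_classes = len(subsequence_probs[0])
    scores = [
        sum(math.log(p[c] + eps) for p in subsequence_probs) / len(subsequence_probs)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=scores.__getitem__)


def classify_scene(sequence, classifier, length, hop):
    """Classify each subsequence with `classifier` (a stand-in for the
    trained network) and aggregate into one scene-level label."""
    windows = split_into_subsequences(sequence, length, hop)
    return aggregate([classifier(w) for w in windows])
```

Any callable that maps a subsequence to a list of class probabilities can be passed as `classifier`, which keeps the aggregation logic independent of the network that produces the per-subsequence outputs.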
Pages: 3043-3047
Page count: 5