Virtual Adversarial Training and Data Augmentation for Acoustic Event Detection with Gated Recurrent Neural Networks

Cited by: 9
Authors
Zoehrer, Matthias [1]
Pernkopf, Franz [1]
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Intelligent Syst Grp, Graz, Austria
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Funding
Austrian Science Fund (FWF);
Keywords
Acoustic event detection; gated recurrent networks; data augmentation; virtual adversarial training;
DOI
10.21437/Interspeech.2017-1238
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we use gated recurrent neural networks (GRNNs) to efficiently detect environmental events in the IEEE Detection and Classification of Acoustic Scenes and Events challenge (DCASE2016). Data for this acoustic event detection task is limited. Therefore, we propose data augmentation, such as on-the-fly shuffling, and virtual adversarial training to regularize the GRNNs. Both improve the performance of the GRNNs. We obtain a segment-based error rate of 0.59 and an F-score of 58.6%.
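The abstract reports a segment-based error rate and an F-score. As a minimal sketch of how such figures are commonly computed for polyphonic event detection, the following assumes the usual segment-based definitions (substitutions, deletions, and insertions counted per segment, normalized by the number of active reference events). The record does not spell out the exact evaluation procedure, so the function name `segment_based_scores` and the input layout (one 0/1 activity list per segment) are illustrative assumptions, not the authors' code.

```python
def segment_based_scores(reference, estimate):
    """Return (error_rate, f_score) for segment-wise binary event activity.

    reference, estimate: lists of segments; each segment is a list with one
    0/1 entry per event class (1 = class active in that segment).
    This input layout is an assumption made for illustration.
    """
    tp = fp = fn = 0
    substitutions = deletions = insertions = 0
    n_reference = 0  # total number of active reference events

    for ref_seg, est_seg in zip(reference, estimate):
        seg_tp = sum(1 for r, e in zip(ref_seg, est_seg) if r and e)
        seg_fp = sum(1 for r, e in zip(ref_seg, est_seg) if e and not r)
        seg_fn = sum(1 for r, e in zip(ref_seg, est_seg) if r and not e)

        tp += seg_tp
        fp += seg_fp
        fn += seg_fn

        # A false positive paired with a false negative in the same
        # segment counts as one substitution; the remainder counts as
        # deletions (missed events) or insertions (extra events).
        substitutions += min(seg_fp, seg_fn)
        deletions += max(0, seg_fn - seg_fp)
        insertions += max(0, seg_fp - seg_fn)
        n_reference += sum(1 for r in ref_seg if r)

    errors = substitutions + deletions + insertions
    error_rate = errors / n_reference if n_reference else 0.0
    denom = 2 * tp + fp + fn
    f_score = 2 * tp / denom if denom else 0.0
    return error_rate, f_score


# Toy example: three segments, three event classes.
reference = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
estimate = [[1, 0, 0], [1, 0, 1], [0, 0, 0]]
error_rate, f_score = segment_based_scores(reference, estimate)
```

In the toy example, the second segment contributes one substitution and the third one deletion, against four active reference events. Note that the error rate can exceed 1.0 when a system produces many insertions, so it is not a percentage.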
Pages: 493-497
Number of pages: 5