Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking

Cited by: 13
Authors
Fonseca, Eduardo [1]
Hershey, Shawn [2]
Plakal, Manoj [2]
Ellis, Daniel P. W. [2]
Jansen, Aren [2]
Moore, R. Channing [2]
Affiliations
[1] Universitat Pompeu Fabra, Music Technology Group, Barcelona 08002, Spain
[2] Google Research, New York, NY 10011, USA
Keywords
Sound event recognition; label noise; missing labels; teacher-student; loss masking
DOI
10.1109/LSP.2020.3006378
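Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract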
The study of label noise in sound event recognition has recently gained attention with the advent of larger and noisier datasets. This work addresses the problem of missing labels, one of the big weaknesses of large audio datasets, and one of the most conspicuous issues for AudioSet. We propose a simple and model-agnostic method based on a teacher-student framework with loss masking to first identify the most critical missing label candidates, and then ignore their contribution during the learning process. We find that a simple optimisation of the training label set improves recognition performance without additional computation. We discover that most of the improvement comes from ignoring a critical tiny portion of the missing labels. We also show that the damage done by missing labels is larger as the training set gets smaller, yet it can still be observed even when training with massive amounts of audio. We believe these insights can generalize to other large-scale datasets.
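The loss-masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_bce`, the fixed `threshold` on teacher scores, and the NumPy formulation are all assumptions made for illustration. The idea shown is only that negative labels which a teacher model scores highly are treated as missing-label candidates and excluded from the student's binary cross-entropy.

```python
import numpy as np

def masked_bce(student_probs, labels, teacher_probs, threshold=0.9, eps=1e-7):
    """Binary cross-entropy that ignores suspected missing labels.

    Assumption (hypothetical rule): an entry labelled 0 whose teacher score
    exceeds `threshold` is treated as a missing-label candidate and masked.
    """
    student_probs = np.clip(student_probs, eps, 1.0 - eps)
    # Element-wise binary cross-entropy for every (clip, class) entry.
    bce = -(labels * np.log(student_probs)
            + (1.0 - labels) * np.log(1.0 - student_probs))
    # Keep an entry unless it is a negative label the teacher scores highly.
    keep = ~((labels == 0) & (teacher_probs > threshold))
    # Average over the kept entries only.
    return float((bce * keep).sum() / max(keep.sum(), 1))

labels = np.array([[1.0, 0.0, 0.0]])
teacher = np.array([[0.8, 0.95, 0.1]])   # class 1 looks like a missing label
student = np.array([[0.9, 0.9, 0.1]])
loss_masked = masked_bce(student, labels, teacher)
loss_plain = masked_bce(student, labels, teacher, threshold=1.0)  # nothing masked
```

In this toy case the second class is labelled negative but scored 0.95 by the teacher, so it is dropped from the loss and the student is not penalised for predicting it. The masked loss is therefore lower than the unmasked one.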
Pages: 1235-1239
Page count: 5