What Makes Audio Event Detection Harder than Classification?

Cited by: 0
Authors
Huy Phan [1, 2]
Koch, Philipp [1]
Katzberg, Fabrice [1]
Maass, Marco [1]
Mazur, Radoslaw [1]
McLoughlin, Ian [3]
Mertins, Alfred [1]
Affiliations
[1] Univ Lubeck, Inst Signal Proc, Lubeck, Germany
[2] Univ Lubeck, Grad Sch Comp Med & Life Sci, Lubeck, Germany
[3] Univ Kent, Sch Comp, Canterbury, Kent, England
Source
2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2017
Keywords
FEATURES; RECOGNITION;
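DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809
Abstract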
It is commonly observed that audio event classification is easier to deal with than detection. So far, this observation has been accepted as fact without careful analysis. In this paper, we examine the reasons behind it and, more importantly, leverage them to benefit the audio event detection task. We present an improved detection pipeline in which a verification step is appended to augment a detection system. This step employs a high-quality event classifier to postprocess the benign event hypotheses output by the detection system and reject false alarms. To demonstrate the effectiveness of the proposed pipeline, we implement and pair up different event detectors based on the most common detection schemes and various event classifiers, ranging from the standard bag-of-words model to the state-of-the-art bank-of-regressors one. Experimental results on the ITC-Irst dataset show significant improvements in detection performance. More importantly, these improvements are consistent across all detector-classifier combinations.
Pages: 2739-2743
Page count: 5
Related papers
26 in total
  • [1] [Anonymous], 2015, P IEEE INT JOINT C N, DOI 10.1109/IJCNN.2015.7280624
  • [2] [Anonymous], IEEE AASP CHALLENGE
  • [3] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [4] Overlapping sound event recognition using local spectrogram features and the generalised hough transform
    Dennis, J.
    Tran, H. D.
    Chng, E. S.
    [J]. PATTERN RECOGNITION LETTERS, 2013, 34 (09) : 1085 - 1093
  • [5] Image Feature Representation of the Subband Power Distribution for Robust Sound Event Classification
    Dennis, Jonathan
    Tran, Huy Dat
    Chng, Eng Siong
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (02) : 367 - 377
  • [6] Giannoulis P, 2014, EUR SIGNAL PR CONF, P2375
  • [7] Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks
    Huy Phan
    Hertel, Lars
    Maass, Marco
    Mertins, Alfred
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 3653 - 3657
  • [8] Learning Representations for Nonspeech Audio Events Through Their Similarities to Speech Patterns
    Huy Phan
    Hertel, Lars
    Maass, Marco
    Mazur, Radoslaw
    Mertins, Alfred
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (04) : 807 - 822
  • [9] Random Regression Forests for Acoustic Event Detection and Classification
    Huy Phan
    Maass, Marco
    Mazur, Radoslaw
    Mertins, Alfred
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (01) : 20 - 31
  • [10] Lazebnik S., 2006, P IEEE COMPUTER SOC, V2, P2169, DOI 10.1109/CVPR.2006.68