Detecting novel objects in acoustic scenes through classifier incongruence

Cited by: 0
Authors
Bach, Joerg-Hendrik [1]
Anemueller, Joern [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, D-26111 Oldenburg, Germany
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
sound classification; acoustic objects; event detection; novelty detection; modulation spectrogram
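DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract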
In this study, a new generic framework for the detection and interpretation of disagreement ("incongruence") between different classifiers [1] is applied to the problem of detecting novel acoustic objects in an office environment. Using a general model that detects generic acoustic objects (standing out from a stationary background) and specific models tuned to particular sounds expected in the office, a novel object is detected as an incongruence between the models: the general model detects it as a generic object, but the specific models cannot identify it as any of the known office-related sources. The detectors are realized using amplitude modulation spectrogram and RASTA-PLP features with support vector machine classification. The data considered are speech and non-speech sounds embedded in real office background at signal-to-noise ratios (SNR) from +20 dB to -20 dB. Our approach yields approximately 90% hit rate for novel events at +20 dB SNR, 75% at 0 dB and reaches chance level below -10 dB.
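The incongruence rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `detect_novel_object`, the class names, and the zero thresholds (mirroring the sign of an SVM decision value) are all assumptions made for illustration. It shows only the decision logic: a frame is flagged as novel when the general detector fires but none of the specific detectors accepts it.

```python
from typing import Mapping


def detect_novel_object(
    general_score: float,
    specific_scores: Mapping[str, float],
    general_threshold: float = 0.0,   # assumed: decision value > 0 means "object present"
    specific_threshold: float = 0.0,  # assumed: decision value > 0 means "class accepted"
) -> str:
    """Return 'background', 'known:<class>', or 'novel' for one analysis frame.

    general_score   -- decision value of the general acoustic-object detector
    specific_scores -- decision values of the specific detectors, keyed by
                       (hypothetical) class name
    """
    # The general model sees nothing standing out from the background.
    if general_score <= general_threshold:
        return "background"

    # Keep only the specific models that accept the frame.
    accepted = {
        name: score
        for name, score in specific_scores.items()
        if score > specific_threshold
    }
    if accepted:
        # Congruent case: general and at least one specific model agree.
        best = max(accepted, key=accepted.get)
        return f"known:{best}"

    # Incongruent case: an object is present, but no known class explains it.
    return "novel"


if __name__ == "__main__":
    # Hypothetical decision values for three frames.
    print(detect_novel_object(-0.5, {"speech": -1.0, "keyboard": -0.7}))
    print(detect_novel_object(1.2, {"speech": 0.8, "keyboard": -0.3}))
    print(detect_novel_object(1.2, {"speech": -0.8, "keyboard": -0.3}))
```

In a full system the scores would come from trained classifiers operating on the acoustic features; here they are passed in directly so the decision rule can be read in isolation.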
Pages: 2206 - 2209
Number of pages: 4
Related papers
15 in total
  • [1] Anemüller J, 2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, P2582
  • [2] The bag-of-frames approach to audio pattern recognition: A sufficient model for urban soundscapes but not for polyphonic music
    Aucouturier, Jean-Julien
    Defreville, Boris
    Pachet, Francois
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2007, 122 (02) : 881 - 891
  • [3] MODULATION-BASED DETECTION OF SPEECH IN REAL BACKGROUND NOISE: GENERALIZATION TO NOVEL BACKGROUND CLASSES
    Bach, Joerg-Hendrik
    Kollmeier, Birger
    Anemueller, Joern
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010 : 41 - 44
  • [4] Bugalho M, 2009, INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, P1147
  • [5] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [6] Flexer A., 2005, P C INT SOC MUS INF, P260
  • [7] Foote J., 1999, P IEEE C MULT EXP 19, P452
  • [8] RASTA Processing of Speech
    Hermansky, Hynek
    Morgan, Nelson
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (04) : 578 - 589
  • [9] Jie L., 2008, P ICVS SANT GREEC
  • [10] Novelty detection: a review - part 2: neural network based approaches
    Markou, M
    Singh, S
    [J]. SIGNAL PROCESSING, 2003, 83 (12) : 2499 - 2521