Sparse Representation with Temporal Max-Smoothing for Acoustic Event Detection

Authors
Lu, Xugang [1]
Shen, Peng [1]
Tsao, Yu [2]
Hori, Chiori [1]
Kawai, Hisashi [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
Feature learning; matching pursuit; temporal max-smoothing; acoustic event detection
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
To incorporate long temporal-frequency structure into acoustic event detection, we previously proposed a spectral-patch-based learning and representation method. The learned spectral patches were treated as acoustic words and then used in sparse encoding for acoustic feature representation and modeling. In that previous study, each spectral patch was encoded independently during the feature encoding stage. Because neighboring spectral patches taken from a time sequence should keep similar representations after encoding, in this study we propose to enhance the temporal correlation of the feature representation using a temporal max-smoothing algorithm. Max-smoothing picks the maximum response within a local time window as the representative feature for the detection task. We tested the new feature on automatic detection of acoustic events selected from lecture audio data. Experimental results showed that temporal max-smoothing significantly improved performance.
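The max-smoothing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `temporal_max_smooth`, the `half_width` parameter, the centered symmetric window, and the choice to take the maximum of raw (not absolute) responses are all assumptions made for illustration. The input is assumed to be a matrix of per-patch sparse codes, one row per time step and one column per acoustic word.

```python
import numpy as np


def temporal_max_smooth(codes, half_width=2):
    """Replace each time step's code with the element-wise maximum
    over a local window of neighboring time steps.

    codes: array of shape (num_frames, num_words), hypothetical sparse
           responses of each spectral patch to each acoustic word.
    half_width: hypothetical window parameter; the window covers
           frames t - half_width .. t + half_width, clipped at the edges.
    """
    codes = np.asarray(codes, dtype=float)
    num_frames = codes.shape[0]
    smoothed = np.empty_like(codes)
    for t in range(num_frames):
        lo = max(0, t - half_width)
        hi = min(num_frames, t + half_width + 1)
        # Maximum response per acoustic word inside the local window.
        smoothed[t] = codes[lo:hi].max(axis=0)
    return smoothed


# Toy example: 5 time steps, 2 acoustic words.
codes = np.array([
    [0.0, 0.2],
    [0.9, 0.0],
    [0.0, 0.0],
    [0.0, 0.5],
    [0.1, 0.0],
])
smoothed = temporal_max_smooth(codes, half_width=1)
print(smoothed)
```

With independent encoding, the strong response at the second time step is isolated; after smoothing, its neighbors share it, which is one way to read the abstract's requirement that neighboring patches keep similar representations.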
Pages: 1176-1180
Number of pages: 5