Sparse Representation with Temporal Max-Smoothing for Acoustic Event Detection

Cited by: 0
Authors
Lu, Xugang [1]
Shen, Peng [1]
Tsao, Yu [2]
Hori, Chiori [1]
Kawai, Hisashi [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
Feature learning; matching pursuit; temporal max-smoothing; acoustic event detection
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
To incorporate long temporal-frequency structure for acoustic event detection, we previously proposed a spectral-patch-based learning and representation method. The learned spectral patches were regarded as acoustic words, which were then used in sparse encoding for acoustic feature representation and modeling. In our previous study, each spectral patch was encoded independently during the feature encoding stage. Because spectral patches taken from a time sequence should keep similar representations for neighboring patches after encoding, in this study we propose to enhance the temporal correlation of the feature representation using a temporal max-smoothing algorithm. The max-smoothing picks the maximum response in a local time window as the representative feature for the detection task. We tested the new feature on automatic detection of acoustic events selected from lecture audio data. Experimental results showed that temporal max-smoothing significantly improved detection performance.
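The max-smoothing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `temporal_max_smoothing`, the `half_window` parameter, the symmetric window clipped at the sequence boundaries, and the toy code values are all assumptions made for illustration. It also assumes the encoded responses are non-negative, so that a larger value means a stronger response to an acoustic word.

```python
from typing import List, Sequence


def temporal_max_smoothing(
    codes: Sequence[Sequence[float]], half_window: int = 2
) -> List[List[float]]:
    """Replace each frame's code by the element-wise max over nearby frames.

    codes: one encoded vector per time step (T rows, K acoustic-word columns).
    half_window: hypothetical parameter; the window covers frames
        t - half_window .. t + half_window, clipped at the sequence ends.
    """
    if half_window < 0:
        raise ValueError("half_window must be non-negative")
    num_frames = len(codes)
    smoothed = []
    for t in range(num_frames):
        start = max(0, t - half_window)
        stop = min(num_frames, t + half_window + 1)
        window = codes[start:stop]
        # zip(*window) groups the values of each acoustic word across the
        # frames in the window; max keeps the strongest response.
        smoothed.append([max(column) for column in zip(*window)])
    return smoothed


# Toy input (made-up values): 4 frames, 3 acoustic words, encoded independently.
codes = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0],
    [0.0, 0.0, 0.7],
]
smoothed = temporal_max_smoothing(codes, half_window=1)
for row in smoothed:
    print(row)
```

In this toy input, the second frame has an all-zero code when encoded on its own; after smoothing it inherits the strong responses of its neighbors, which is the sense in which neighboring patches end up with similar representations.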
Pages: 1176-1180
Page count: 5