Robust Sound Event Classification by Using Denoising Autoencoder

Cited by: 0

Authors:

Zhou, Jianchao [1]

Peng, Liqun [1]

Chen, Xiaoou [1]

Yang, Deshun [1]

Affiliations:

[1] Peking Univ, Inst Comp Sci & Technol, 128 Zhongguancun North St, Beijing 100871, Peoples R China

Source:

2016 IEEE 18TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2016

Keywords:

DOI:

Not available

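Chinese Library Classification codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: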

0808 ; 0809 ;

Abstract:

Over the last decade, a lot of research has been done on sound event classification. A main problem, however, is that classification performance degrades sharply in the presence of noise. As spectrogram-based image features and denoising autoencoders reportedly have superior performance in noisy conditions, this paper proposes a new robust feature for sound event classification called the denoising autoencoder image feature (DIF), which is an image feature extracted from an image-like representation produced by a denoising autoencoder. The feature is evaluated in a classification experiment using an SVM classifier on audio examples with different noise levels, and compared with baseline features including mel-frequency cepstral coefficients (MFCC) and the spectrogram image feature. The proposed DIF demonstrates better performance under noise-corrupted conditions.


Pages: 6

