Robust Sound Event Classification by Using Denoising Autoencoder

Cited by: 0
Authors
Zhou, Jianchao [1]
Peng, Liqun [1]
Chen, Xiaoou [1]
Yang, Deshun [1]
Affiliation
[1] Peking Univ, Inst Comp Sci & Technol, 128 Zhongguancun North St, Beijing 100871, Peoples R China
Source
2016 IEEE 18TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2016
Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Sound event classification has been studied extensively over the last decade, but its performance degrades sharply in the presence of noise. Because spectrogram-based image features and denoising autoencoders have both been reported to perform well in noisy conditions, this paper proposes a new robust feature for sound event classification, the denoising autoencoder image feature (DIF): an image feature extracted from an image-like representation produced by a denoising autoencoder. The feature is evaluated in a classification experiment using an SVM classifier on audio examples with different noise levels, and compared with baseline features including mel-frequency cepstral coefficients (MFCC) and the spectrogram image feature. The proposed DIF performs better than these baselines under noise-corrupted conditions.
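The abstract does not say how the "different noise levels" were produced. One common way to build such test material is to mix a noise recording into a clean clip at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch of that idea only, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic 440 Hz tone, the white Gaussian noise, and the SNR values 20, 10 and 0 dB are all assumptions made for illustration.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean + gain * noise, with gain chosen to give the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("both signals must have non-zero power")
    # Solve p_clean / (gain^2 * p_noise) = 10^(snr_db / 10) for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a mixture, treating (noisy - clean) as the noise component."""
    p_clean = sum(x * x for x in clean)
    p_resid = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10 * math.log10(p_clean / p_resid)


# Hypothetical stand-ins for a clean sound event and a noise recording.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr in (20, 10, 0):  # assumed noise levels, not taken from the paper
    noisy = mix_at_snr(clean, noise, snr)
    print(snr, round(measured_snr_db(clean, noisy), 3))
```

Features such as MFCC or an image feature would then be extracted from each mixture and passed to the classifier, so that accuracy can be compared across noise levels.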
Pages: 6