ACOUSTIC SCENE CLASSIFICATION USING SPARSE FEATURE LEARNING AND EVENT-BASED POOLING

Cited by: 0
Authors
Lee, Kyogu [1]
Hyung, Ziwon [1]
Nam, Juhan [2]
Affiliations
[1] Seoul Natl Univ, Mus & Audio Res Grp, Seoul 151, South Korea
[2] Stanford Univ, CCRMA, Stanford, CA 94305 USA
Source
2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2013
Keywords
acoustic scene classification; environmental sound; feature learning; restricted Boltzmann machine; sparse feature representation; max-pooling; event detection;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Recently, unsupervised learning algorithms have been used successfully to represent data in many machine recognition tasks. In particular, sparse feature learning algorithms have been shown not only to discover meaningful structures in raw data but also to outperform many hand-engineered features. In this paper, we apply the sparse feature learning approach to acoustic scene classification. We use a sparse restricted Boltzmann machine to capture a wide variety of local acoustic structures from audio data and to represent the data in a high-dimensional sparse feature space given the learned structures. For scene classification, we summarize the local features by pooling over the audio scene data. Whereas feature pooling is typically performed over uniformly divided segments, we propose a new pooling method that first detects audio events and then pools only over the detected events, which accounts for the irregular occurrence of audio events in acoustic scene data. We evaluate the learned features on the IEEE AASP Challenge development set, comparing them with a baseline model that uses mel-frequency cepstral coefficients (MFCCs). The results show that the learned features outperform MFCCs, that event-based pooling achieves higher accuracy than uniform pooling, and that a combination of the two methods performs better than either one used alone.
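The contrast between uniform pooling and event-based pooling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how events are detected, so the detector below (a frame counts as part of an event when its mean activation exceeds a threshold) is an assumption made only for illustration, and every function name and parameter (`uniform_pooling`, `event_based_pooling`, `detect_events`, `segment_len`, `threshold`) is hypothetical. Averaging the pooled vectors into one clip-level summary is likewise an assumption.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def max_pool(frames: Sequence[Frame]) -> List[float]:
    """Element-wise maximum over a non-empty group of feature frames."""
    return [max(column) for column in zip(*frames)]


def mean_of(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean over a non-empty group of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def uniform_pooling(frames: Sequence[Frame], segment_len: int) -> List[float]:
    """Max-pool over fixed-length segments, then average the segment vectors."""
    segments = [frames[i:i + segment_len]
                for i in range(0, len(frames), segment_len)]
    return mean_of([max_pool(segment) for segment in segments])


def detect_events(frames: Sequence[Frame], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for runs of active frames.

    ASSUMPTION: a frame is 'active' when its mean activation exceeds
    `threshold`. The detector used in the paper is not given in the abstract.
    """
    events: List[Tuple[int, int]] = []
    start = None
    for i, frame in enumerate(frames):
        active = sum(frame) / len(frame) > threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(frames)))
    return events


def event_based_pooling(frames: Sequence[Frame], threshold: float) -> List[float]:
    """Max-pool only over detected events, then average the event vectors."""
    events = detect_events(frames, threshold)
    if not events:
        # Fallback (assumption): with no detected event, pool over the whole clip.
        return max_pool(frames)
    return mean_of([max_pool(frames[s:e]) for s, e in events])


# Toy clip: six frames of two-dimensional sparse activations.
clip = [[0.0, 0.0], [0.8, 0.2], [0.6, 0.4], [0.0, 0.0], [0.1, 0.9], [0.0, 0.0]]
uniform_summary = uniform_pooling(clip, segment_len=2)
event_summary = event_based_pooling(clip, threshold=0.3)
```

In the toy clip, the uniform segments cut the first event in half and mix its frames with silent ones, whereas the event-based variant pools over the two active runs as whole units. Combining the two pooling methods, as the abstract reports, might amount to concatenating the two summary vectors, though the abstract does not specify how the combination is done.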
Pages: 4