Feature Learning With Matrix Factorization Applied to Acoustic Scene Classification

Cited by: 79
Authors
Bisot, Victor [1]
Serizel, Romain [1, 2, 3, 4]
Essid, Slim [1]
Richard, Gael [1]
Affiliations
[1] Univ Paris Saclay, Telecom Paris Tech, LTCI, F-75013 Paris, France
[2] Univ Lorraine, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
[3] Inria, F-54600 Villers Les Nancy, France
[4] CNRS, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
Keywords
Acoustic scene classification; feature learning; matrix factorization; recognition
DOI
10.1109/TASLP.2017.2690570
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we study the usefulness of various matrix factorization methods for learning features to be used for the acoustic scene classification (ASC) problem. A common way of addressing ASC has been to engineer features capable of capturing the specificities of acoustic environments. Instead, we show that better representations of the scenes can be automatically learned from time-frequency representations using matrix factorization techniques. We mainly focus on extensions of principal component analysis and nonnegative matrix factorization, including sparse, kernel-based, and convolutive variants and a novel supervised dictionary learning variant. An experimental evaluation is performed on two of the largest ASC datasets available in order to compare and discuss the usefulness of these methods for the task. We show that the unsupervised learning methods provide better representations of acoustic scenes than the best conventional hand-crafted features on both datasets. Furthermore, the introduction of a novel nonnegative supervised matrix factorization model and deep neural networks trained on spectrograms allows us to reach further improvements.
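As a rough illustration of the unsupervised idea described in the abstract, the following sketch factorizes a nonnegative time-frequency matrix with plain nonnegative matrix factorization (NMF) and pools the activations over time to get one feature vector per recording. It is a minimal sketch under stated assumptions, not the authors' method: the multiplicative-update rule for the Euclidean loss, the synthetic matrix `V`, the dictionary size, and the mean pooling step are all choices made here for illustration, and the names `nmf` and `rel_error` are hypothetical.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-10, seed=0):
    """Factorize a nonnegative matrix V (features x frames) as W @ H.

    Uses standard multiplicative updates for the Euclidean (Frobenius) loss.
    This is an illustrative choice, not the variant evaluated in the paper.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_features, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations, then the dictionary; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def rel_error(V, W, H):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


# Synthetic stand-in for a magnitude spectrogram: 40 bands x 100 frames,
# built from 4 nonnegative components (assumed data, not a real recording).
demo_rng = np.random.default_rng(1)
V = demo_rng.random((40, 4)) @ demo_rng.random((4, 100))

W, H = nmf(V, n_components=4)

# Pool activations over time to obtain one fixed-length vector per recording,
# which could then be passed to any standard classifier.
scene_feature = H.mean(axis=1)
print(scene_feature.shape, rel_error(V, W, H))
```

In this sketch the columns of `W` play the role of a learned dictionary and `scene_feature` is the learned representation; the sparse, kernel-based, convolutive, and supervised extensions mentioned in the abstract would each change the objective or the update rules.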
Pages: 1216-1229
Number of pages: 14