MUSIC GENRE CLASSIFICATION VIA TOPOLOGY PRESERVING NON-NEGATIVE TENSOR FACTORIZATION AND SPARSE REPRESENTATIONS

Cited by: 20

Authors:

Panagakis, Yannis [1]

Kotropoulos, Constantine [1]

Affiliations:

[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

Music genre classification; topology preserving; non-negative tensor factorization; sparse representations; MATRIX FACTORIZATION; FACE RECOGNITION; FEATURES; SIGNALS;

DOI:

10.1109/ICASSP.2010.5495984

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Motivated by the rich, psycho-physiologically grounded properties of auditory cortical representations and the power of sparse representation-based classifiers, we propose a robust music genre classification framework. Its first pillar is a novel multilinear subspace analysis method that reduces the dimensionality of cortical representations of music signals while preserving their topology. Its second pillar is sparse representation-based classification, which models any test cortical representation as a sparse weighted sum of dictionary atoms. These atoms stem from training cortical representations of known genre, under the assumption that the representations of music recordings of the same genre lie close to one another in the tensor space. Accordingly, the dimensionality reduction is performed in a manner compatible with the working principle of sparse representation-based classification. Music genre classification accuracies of 93.7% and 94.93% are reported on the GTZAN and the ISMIR2004 Genre datasets, respectively. Both accuracies exceed any previously reported for state-of-the-art music genre classification algorithms applied to these datasets.
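The sparse representation-based classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the function names `omp` and `src_classify` are hypothetical, greedy orthogonal matching pursuit stands in for the l1-minimization solver usually associated with this classifier, the data are synthetic vectors rather than cortical representations, and the topology-preserving tensor factorization step is not reproduced.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed stand-in for an l1 solver).

    Approximates y as a combination of at most n_nonzero columns (atoms) of D.
    """
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def src_classify(D, labels, y, n_nonzero=5):
    """Assign y to the class whose atoms alone reconstruct it best."""
    coef = omp(D, y, n_nonzero)
    best_class, best_residual = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # Keep only the coefficients that belong to class c.
        r = np.linalg.norm(y - D[:, mask] @ coef[mask])
        if r < best_residual:
            best_class, best_residual = int(c), r
    return best_class


# Synthetic illustration: two "genres", each clustered around its own center.
rng = np.random.default_rng(0)
dim, per_class = 20, 10
centers = rng.normal(size=(2, dim))
atoms, labels = [], []
for c in range(2):
    for _ in range(per_class):
        atoms.append(centers[c] + 0.1 * rng.normal(size=dim))
        labels.append(c)
D = np.array(atoms).T
D /= np.linalg.norm(D, axis=0)  # unit-norm dictionary atoms
labels = np.array(labels)

test = centers[1] + 0.1 * rng.normal(size=dim)
test /= np.linalg.norm(test)
predicted = src_classify(D, labels, test)
print("predicted class:", predicted)
```

The decision rule compares class-wise reconstruction residuals, so the classifier benefits when same-genre training samples lie close to the test sample. This is the assumption the abstract makes about the reduced tensor space.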


Pages: 249-252

Page count: 4

