Importance of audio feature reduction in automatic music genre classification

Cited by: 13

Authors:

Baniya, Babu Kaji ^{[1]}

Lee, Joonwhoan ^{[1]}

Affiliations:

[1] Chonbuk Natl Univ, Dept Comp Sci & Engn, Jeonju 561756, South Korea

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2016 / Vol. 75 / Issue 06

Funding:

National Research Foundation of Singapore;

Keywords:

Music genres; Dimensionality; Locality preserving projection; Non-negative matrix factorization; INFORMATION; RETRIEVAL;

DOI:

10.1007/s11042-014-2418-z

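Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract: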

Multimedia database retrieval is rapidly growing and its popularity in online retrieval systems is consequently increasing. Large datasets are major challenges for searching, retrieving, and organizing the music content. Therefore, a robust automatic music-genre classification method is needed for organizing this music data into different classes according to specific viable information. Two fundamental components are to be considered for genre classification: audio feature extraction and classifier design. In this paper, we propose diverse audio features to precisely characterize the music content. The feature sets belong to four groups: dynamic, rhythmic, spectral, and harmonic. From the features, five statistical parameters are considered as representatives, including the fourth-order central moments of each feature as well as covariance components. Ultimately, insignificant representative parameters are controlled by minimum redundancy and maximum relevance. This algorithm calculates the score level of all feature attributes and orders them. Only high-score audio features are considered for genre classification. Moreover, we can recognize those audio features and distinguish which of the different statistical parameters derived from them are important for genre classification. Among them, mel frequency cepstral coefficient statistical parameters, such as covariance components and variance, are more frequently selected than the feature attributes of other groups. This approach does not transform the original features as do principal component analysis and linear discriminant analysis. In addition, other feature reduction methodologies, such as locality-preserving projection and non-negative matrix factorization, are considered. The performance of the proposed system is measured based on the reduced features from the feature pool using different feature reduction techniques. The results indicate that the overall classification is competitive with existing state-of-the-art frame-based methodologies.
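
The scoring-and-ordering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: minimum redundancy maximum relevance is usually defined with mutual information, whereas this sketch substitutes absolute Pearson correlation to stay dependency-free. The function names `pearson` and `mrmr_rank` and the toy data are assumptions made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_rank(columns, labels, k):
    """Greedy mRMR-style selection (correlation stand-in for mutual information).

    columns: list of feature columns, each a list of numbers
    labels:  numeric class labels, one per sample
    k:       number of features to keep
    Returns the indices of the selected columns in selection order.
    """
    # Relevance: how strongly each feature tracks the class label.
    relevance = [abs(pearson(col, labels)) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            # Redundancy: mean similarity to the features already chosen.
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented): 8 samples, 2 classes, 3 candidate feature attributes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],  # most relevant
    [0, 0, 1, 1, 1, 1, 1, 1],  # relevant, but largely redundant with column 0
    [1, 0, 0, 0, 0, 1, 1, 1],  # less relevant, but less redundant with column 0
]
top_two = mrmr_rank(columns, labels, 2)
```

On this toy data the second pick is column 2 rather than column 1, even though column 1 is more strongly correlated with the labels, because column 1 mostly repeats what column 0 already provides. Unlike principal component analysis, the selected features are original attributes, not transformed ones.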


Pages: 3013-3026

Page count: 14

