Music Genre and Emotion Recognition Using Gaussian Processes

Cited by: 55
Authors
Markov, Konstantin [1]
Matsui, Tomoko [2]
Affiliations
[1] Univ Aizu, Div Informat Syst, Aizu Wakamatsu, Fukushima 9658580, Japan
[2] Inst Stat Math, Dept Stat Modeling, Tokyo 1068569, Japan
Source
IEEE ACCESS | 2014 / Volume 2
Keywords
Music genre classification; music emotion estimation; Gaussian processes; PROCESS DYNAMICAL MODELS; CLASSIFICATION; REGRESSION
DOI
10.1109/ACCESS.2014.2333095
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Gaussian Processes (GPs) are Bayesian nonparametric models that are becoming increasingly popular for their ability to capture highly nonlinear data relationships in various tasks, such as dimensionality reduction, time series analysis, and novelty detection, as well as classical regression and classification. In this paper, we investigate the feasibility and applicability of GP models for music genre classification and music emotion estimation, two of the main tasks in the music information retrieval (MIR) field. So far, the support vector machine (SVM) has been the dominant model used in MIR systems. Like the SVM, GP models are based on kernel functions and Gram matrices; in contrast, they produce truly probabilistic outputs with an explicit degree of prediction uncertainty. In addition, there exist algorithms for GP hyperparameter learning, something the SVM framework lacks. We built two systems, one for music genre classification and another for music emotion estimation, each using both SVM and GP models, and compared their performance on two databases of similar size. In all cases, the music audio signal was processed in the same way, and the effects of different feature extraction methods and their various combinations were also investigated. The evaluation experiments clearly showed that in both the music genre classification and music emotion estimation tasks the GP performed consistently better than the SVM. The GP achieved a 13.6% relative genre classification error reduction and up to an 11% absolute increase of the coefficient of determination in the emotion estimation task.
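The abstract reports one result as a relative error reduction and the other as an absolute increase of the coefficient of determination. The sketch below illustrates how the two measures are usually computed. The function names and all numbers are hypothetical, chosen for easy arithmetic; they are not taken from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error that the new model removes."""
    return (baseline_error - new_error) / baseline_error


def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Hypothetical classification error rates (NOT the paper's figures).
svm_error, gp_error = 0.25, 0.20
rer = relative_error_reduction(svm_error, gp_error)

# Hypothetical regression targets and model predictions (NOT the paper's data).
targets = [1.0, 2.0, 3.0, 4.0]
svm_pred = [1.5, 1.5, 3.5, 3.5]
gp_pred = [1.0, 2.0, 3.5, 3.5]
r2_svm = r_squared(targets, svm_pred)
r2_gp = r_squared(targets, gp_pred)
absolute_r2_gain = r2_gp - r2_svm

print(f"relative error reduction: {rer:.1%}")
print(f"absolute R^2 increase: {absolute_r2_gain:.2f}")
```

A relative reduction is measured against the baseline error, so its size depends on how large that baseline was, whereas an absolute increase is a plain difference between the two scores.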
Pages: 688-697
Page count: 10