Musical Emotion Recognition with Spectral Feature Extraction Based on a Sinusoidal Model with Model-Based and Deep-Learning Approaches

Cited by: 5

Authors:

Xie, Baijun [1]

Kim, Jonathan C. [1]

Park, Chung Hyuk [1]

Affiliation:

[1] George Washington Univ, Dept Biomed Engn, Washington, DC 20052 USA

Source:

APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 03

Funding:

National Science Foundation (USA); National Institutes of Health (USA);

Keywords:

musical emotion recognition; spectral feature extraction; sinusoidal model; principal component regression; deep learning; machine learning; PITCH;

DOI:

10.3390/app10030902

Chinese Library Classification:

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

This paper presents a method for extracting novel spectral features based on a sinusoidal model. The method is focused on characterizing the spectral shapes of audio signals using spectral peaks in frequency sub-bands. The extracted features are evaluated for predicting the levels of emotional dimensions, namely arousal and valence. Principal component regression, partial least squares regression, and deep convolutional neural network (CNN) models are used as prediction models for the levels of the emotional dimensions. The experimental results indicate that the proposed features include additional spectral information that common baseline features may not include. Since the quality of audio signals, especially timbre, plays a major role in affecting the perception of emotional valence in music, the inclusion of the presented features will contribute to decreasing the prediction error rate.
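As a rough illustration of the pipeline the abstract outlines (spectral peaks per frequency sub-band, followed by principal component regression), the following is a minimal sketch rather than the authors' implementation. It assumes equal-width sub-bands and picks the largest FFT bin in each band as a stand-in for sinusoidal-model peak estimation; the function names `subband_peak_features` and `fit_pcr`, the band count, and the feature layout are all hypothetical.

```python
import numpy as np


def subband_peak_features(frame, sample_rate, n_bands=8):
    """Return [peak_freq_hz, peak_mag_db] for each sub-band of one audio frame.

    Assumption: equal-width sub-bands and a plain FFT maximum per band.
    The paper's sinusoidal model may define bands and peaks differently.
    """
    windowed = frame * np.hanning(len(frame))          # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))           # magnitude spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    features = []
    for bins in np.array_split(np.arange(len(spectrum)), n_bands):
        k = bins[np.argmax(spectrum[bins])]            # strongest bin in this band
        features.append(freqs[k])
        features.append(20.0 * np.log10(spectrum[k] + 1e-12))
    return np.array(features)


def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on centered X, then least squares.

    Returns (coef, intercept) expressed in the original feature space, so a
    prediction is X_new @ coef + intercept.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # Rows of vt are the principal directions of the centered features.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:n_components].T                            # (n_features, n_components)

    scores = Xc @ V                                    # project onto the components
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)

    coef = V @ gamma                                   # map back to feature space
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

In such a setup, one feature vector per frame (or per clip, after averaging over frames) would form a row of `X`, and `y` would hold the annotated arousal or valence levels; choosing `n_components` smaller than the number of features is what distinguishes this from ordinary least squares.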

Pages: 11