Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition

Cited by: 0

Authors:

Schmidt, Erik M. [1]

Prockup, Matthew [1]

Scott, Jeffrey [1]

Dolhansky, Brian [2]

Morton, Brandon G. [1]

Kim, Youngmoo E. [1]

Affiliations:

[1] Drexel University, Philadelphia, PA, USA

[2] University of Pennsylvania, Philadelphia, PA 19104, USA

Source:

FROM SOUNDS TO MUSIC AND EMOTIONS | 2013 / Vol. 7900

Keywords:

emotion; music emotion recognition; features; acoustic features; machine learning; invariance; MODE; RESPONSES; TEMPO;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

While the organization of music in terms of emotional affect is a natural process for humans, quantifying it empirically proves to be a very difficult task. Consequently, no acoustic feature (or combination thereof) has emerged as the optimal representation for musical emotion recognition. Due to the subjective nature of emotion, determining whether an acoustic feature domain is informative requires evaluation by human subjects. In this work, we seek to perceptually evaluate two of the most commonly used features in music information retrieval: mel-frequency cepstral coefficients and chroma. Furthermore, to identify emotion-informative feature domains, we explore which musical features are most relevant in determining emotion perceptually, and which acoustic feature domains are most variant or invariant to those changes. Finally, given our collected perceptual data, we conduct an extensive computational experiment for emotion prediction accuracy on a large number of acoustic feature domains, investigating pairwise prediction both in the context of a general corpus as well as in the context of a corpus that is constrained to contain only specific musical feature transformations.

Pages: 278-300

Page count: 23