Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition

Cited: 0
Authors
Schmidt, Erik M. [1 ]
Prockup, Matthew [1 ]
Scott, Jeffrey [1 ]
Dolhansky, Brian [2 ]
Morton, Brandon G. [1 ]
Kim, Youngmoo E. [1 ]
Affiliations
[1] Drexel Univ, Philadelphia, PA USA
[2] Univ Penn, Philadelphia, PA 19104 USA
Source
FROM SOUNDS TO MUSIC AND EMOTIONS | 2013, Vol. 7900
Keywords
emotion; music emotion recognition; features; acoustic features; machine learning; invariance; MODE; RESPONSES; TEMPO
DOI
Not available
CLC Number
TP [Automation and computer technology]
Discipline Code
0812
Abstract
While the organization of music in terms of emotional affect is a natural process for humans, quantifying it empirically proves to be a very difficult task. Consequently, no acoustic feature (or combination thereof) has emerged as the optimal representation for musical emotion recognition. Due to the subjective nature of emotion, determining whether an acoustic feature domain is informative requires evaluation by human subjects. In this work, we seek to perceptually evaluate two of the most commonly used features in music information retrieval: mel-frequency cepstral coefficients and chroma. Furthermore, to identify emotion-informative feature domains, we explore which musical features are most relevant in determining emotion perceptually, and which acoustic feature domains are most variant or invariant to those changes. Finally, given our collected perceptual data, we conduct an extensive computational experiment for emotion prediction accuracy on a large number of acoustic feature domains, investigating pairwise prediction both in the context of a general corpus as well as in the context of a corpus that is constrained to contain only specific musical feature transformations.
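A note on the two feature domains the abstract names: mel-frequency cepstral coefficients (MFCCs) and chroma are both frame-level spectral representations that are typically pooled to a clip-level vector before emotion prediction. The following is a minimal sketch of such an extraction pipeline, assuming the librosa library and its default parameters; the record does not specify the authors' actual tooling or settings.

# Sketch: extracting the two feature domains named in the abstract.
# librosa and all parameter choices here are assumptions, not the paper's method.
import librosa
import numpy as np

def clip_level_features(path, n_mfcc=20):
    y, sr = librosa.load(path, mono=True)                   # decode audio, mono
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape (n_mfcc, frames)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)        # shape (12, frames)
    # Pool frame-level features to a single clip-level vector (mean and
    # standard deviation over time), a common input representation for
    # emotion classifiers.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           chroma.mean(axis=1), chroma.std(axis=1)])

A pairwise prediction experiment like the one described would then train a separate classifier on such vectors for each feature domain and compare accuracies across the general and transformation-constrained corpora.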
Pages: 278-300
Page count: 23
相关论文
共 50 条
[31]   Perceptual features for automatic speech recognition in noisy environments [J].
Haque, Serajul ;
Togneri, Roberto ;
Zaknich, Anthony .
SPEECH COMMUNICATION, 2009, 51 (01) :58-75
[32]   Audio Emotion Recognition using Machine Learning to support Sound Design [J].
Cunningham, Stuart ;
Ridley, Harrison ;
Weinel, Jonathan ;
Picking, Richard .
PROCEEDINGS OF THE 14TH INTERNATIONAL AUDIO MOSTLY CONFERENCE, AM 2019: A Journey in Sound, 2019, :116-123
[33]   An Active Learning Paradigm for Online Audio-Visual Emotion Recognition [J].
Kansizoglou, Ioannis ;
Bampis, Loukas ;
Gasteratos, Antonios .
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (02) :756-768
[34]   Emotion Recognition from Audio Signals using Support Vector Machine [J].
Sinith, M. S. ;
Aswathi, E. ;
Deepa, T. M. ;
Shameema, C. P. ;
Rajan, Shiny .
PROCEEDINGS OF THE 2015 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2015, :139-144
[35]   Using the Fisher Vector Representation for Audio-based Emotion Recognition [J].
Gosztolya, Gabor .
ACTA POLYTECHNICA HUNGARICA, 2020, 17 (06) :7-23
[36]   ISLA: Temporal Segmentation and Labeling for Audio-Visual Emotion Recognition [J].
Kim, Yelin ;
Provost, Emily Mower .
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2019, 10 (02) :196-208
[37]   MANDARIN AUDIO-VISUAL SPEECH RECOGNITION WITH EFFECTS TO THE NOISE AND EMOTION [J].
Pao, Tsang-Long ;
Liao, Wen-Yuan ;
Chen, Yu-Te ;
Wu, Tsan-Nung .
INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2010, 6 (02) :711-723
[38]   A multimodal hierarchical approach to speech emotion recognition from audio and text [J].
Singh, Prabhav ;
Srivastava, Ridam ;
Rana, K. P. S. ;
Kumar, Vineet .
KNOWLEDGE-BASED SYSTEMS, 2021, 229
[39]   Facial, vocal and musical emotion recognition is altered in paranoid schizophrenic patients [J].
Weisgerber, Anne ;
Vermeulen, Nicolas ;
Peretz, Isabelle ;
Samson, Severine ;
Philippot, Pierre ;
Maurage, Pierre ;
D'Aoust, Catherine De Graeuwe ;
De Jaegere, Aline ;
Delatte, Benoit ;
Gillain, Benoit ;
De Longueville, Xavier ;
Constant, Eric .
PSYCHIATRY RESEARCH, 2015, 229 (1-2) :188-193
[40]   Exploring the Causal Relationships Between Musical Features and Physiological Indicators of Emotion [J].
Huang, Wei ;
Bortz, Brennon ;
Knapp, R. Benjamin .
2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2015, :560-566