Machine Learning and Deep Learning in Music Emotion Recognition: A Comprehensive Survey

Times cited: 0

Authors:

Dutta, Jumpi [1]

Chanda, Dipankar [1]

Affiliation:

[1] Assam Engn Coll, Dept Elect Engn, Gauhati, Assam, India

Source:

INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES | 2025, Vol. 10, No. 04

Keywords:

Music emotion recognition; Feature extraction; Machine learning; Deep learning; Human-computer interaction; Classification; Model; Expression

DOI:

10.33889/IJMEMS.2025.10.4.047

Chinese Library Classification (CLC) number:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

Music can express and influence a wide range of emotional states and feelings in humans. Systems that recognize emotion through music analysis have attracted significant interest in both academia and industry because of their applications in fields such as human-machine interaction, music recommendation, and music therapy. Music emotion recognition (MER) is the process of analysing and classifying the affective states conveyed by a piece of music. Because emotion recognition is important in Music Information Retrieval (MIR) research, and because a survey of existing work on emotional music processing supports further research in MER, this paper provides a comprehensive survey with a detailed study of emotion models, features, and music databases. It emphasizes the machine learning and deep learning approaches used in MER to extract emotions from music. The paper concludes with possible directions for future research.

Pages: 977-999

Page count: 23
