Music emotion recognition using recurrent neural networks and pretrained models

Cited by: 15
Authors
Grekow, Jacek [1 ]
Affiliations
[1] Bialystok Tech Univ, Fac Comp Sci, Wiejska 45A, PL-15351 Bialystok, Poland
Keywords
Emotion detection; Audio features; Sequential data; Recurrent neural networks; CLASSIFICATION; HINDI;
DOI
10.1007/s10844-021-00658-5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The article presents experiments using recurrent neural networks to detect emotion in musical segments. Trained regression models predicted continuous emotion values along the axes of Russell's circumplex model. The process of extracting audio features and building sequential data for training networks with long short-term memory (LSTM) units is described. The models were implemented with the WekaDeeplearning4j package, and a series of experiments was carried out on data with different feature sets and varying segmentation. The experiments demonstrated the usefulness of dividing the data into sequences and the value of recurrent networks for recognizing emotion in music, whose results even exceeded those of the SVM regression algorithm. The author analyzed the effect of the network structure and the chosen feature set on the performance of the regressors predicting values on the two axes of the emotion model: arousal and valence. Finally, the use of a pretrained model to process audio features and train a recurrent network on new feature sequences is presented.
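The preparation of sequential data described above can be illustrated with a minimal sketch. This is not the paper's actual pipeline: the feature dimensions, sequence length, and the `make_sequences` helper are hypothetical, and the sketch only shows the general idea of reshaping frame-level audio features into fixed-length sequences suitable for an LSTM regressor predicting arousal and valence.

```python
import numpy as np

def make_sequences(features, seq_len):
    """Split a (frames, n_features) matrix into (n_seq, seq_len, n_features).

    Trailing frames that do not fill a whole sequence are dropped.
    """
    n_seq = features.shape[0] // seq_len
    trimmed = features[: n_seq * seq_len]
    return trimmed.reshape(n_seq, seq_len, features.shape[1])

# Illustrative numbers: 100 frames of 20-dimensional audio features,
# grouped into sequences of 10 consecutive frames each.
feats = np.random.rand(100, 20)
seqs = make_sequences(feats, seq_len=10)
print(seqs.shape)  # (10, 10, 20)
```

Each resulting sequence would then be paired with a continuous arousal or valence label for regression training.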
Pages: 531-546
Number of pages: 16