Music emotion recognition using recurrent neural networks and pretrained models

Cited by: 15
Authors
Grekow, Jacek [1 ]
Affiliations
[1] Bialystok Tech Univ, Fac Comp Sci, Wiejska 45A, PL-15351 Bialystok, Poland
Keywords
Emotion detection; Audio features; Sequential data; Recurrent neural networks; CLASSIFICATION; HINDI;
DOI
10.1007/s10844-021-00658-5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The article presents experiments using recurrent neural networks to detect emotion in musical segments. Trained regression models predicted continuous emotion values along the axes of Russell's circumplex model. The process of extracting audio features and building sequential data for training networks with long short-term memory (LSTM) units is described. The models were implemented with the WekaDeeplearning4j package, and a series of experiments was carried out on data with different feature sets and varying segmentation. The experiments demonstrated the usefulness of dividing the data into sequences and the value of recurrent networks for recognizing emotion in music, whose results even exceeded those of the SVM regression algorithm. The author analyzed the effect of the network structure and the chosen feature set on the performance of the regressors predicting values on the two axes of the emotion model: arousal and valence. Finally, the use of a pretrained model to process audio features and train a recurrent network on new feature sequences is presented.
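The preparation of sequential data described above can be illustrated with a minimal sketch. This is not the paper's actual pipeline: the feature dimensions, sequence length, and the `make_sequences` helper are hypothetical, and the sketch only shows the general idea of reshaping frame-level audio features into fixed-length sequences suitable for an LSTM regressor predicting arousal and valence.

```python
import numpy as np

def make_sequences(features, seq_len):
    """Split a (frames, n_features) matrix into (n_seq, seq_len, n_features).

    Trailing frames that do not fill a whole sequence are dropped.
    """
    n_seq = features.shape[0] // seq_len
    trimmed = features[: n_seq * seq_len]
    return trimmed.reshape(n_seq, seq_len, features.shape[1])

# Illustrative numbers: 100 frames of 20-dimensional audio features,
# grouped into sequences of 10 consecutive frames each.
feats = np.random.rand(100, 20)
seqs = make_sequences(feats, seq_len=10)
print(seqs.shape)  # (10, 10, 20)
```

Each resulting sequence would then be paired with a continuous arousal or valence label for regression training.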
Pages: 531-546
Number of pages: 16