A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 1
Authors
Han, Xiao [1]
Chen, Fuyang [1]
Ban, Junrong [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China
Keywords
Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL
DOI
10.1016/j.asoc.2024.112034
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that combines Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, including the tension-arousal axis, energy-arousal axis, and valence axis, to explain the correlation and accuracy between music data and emotions. Based on this, a music emotion classifier is designed, incorporating GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into Mel-spectrograms as input features. The music emotion selection module adopts a GAI framework of Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive compared to existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, this model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
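As a rough illustration of the "multi-scale parallel convolution and attention" idea named in the abstract, the sketch below runs several convolution branches with different kernel sizes over a spectrogram-like array and fuses them with softmax attention weights. It is not the authors' model: the function names, the kernel sizes, the fixed averaging kernels (stand-ins for learned filters), and the random input are all assumptions made for illustration.

```python
import numpy as np


def conv1d_same(x, kernel):
    """Convolve each row (frequency band) of x along time, keeping its length."""
    return np.stack([np.convolve(row, kernel, mode="same") for row in x])


def multi_scale_attention(spec, kernel_sizes=(3, 5, 7)):
    """Fuse parallel convolution branches with softmax attention weights.

    spec: array of shape (n_bands, n_frames), e.g. a Mel-spectrogram.
    Returns the fused feature map and one attention weight per branch.
    """
    # Parallel branches: one averaging kernel per scale.
    # Assumption: a trained network would use learned filters instead.
    branches = np.stack(
        [conv1d_same(spec, np.ones(k) / k) for k in kernel_sizes]
    )
    # One score per branch: the mean absolute response of that branch.
    scores = np.abs(branches).mean(axis=(1, 2))
    # Softmax turns the scores into weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted sum over the branch axis gives the fused feature map.
    fused = np.tensordot(weights, branches, axes=1)
    return fused, weights


# Hypothetical input: 40 bands by 100 frames of random values.
rng = np.random.default_rng(0)
spec = rng.random((40, 100))
fused, weights = multi_scale_attention(spec)
```

The fused map has the same shape as the input, so in a fuller pipeline it could be passed to a downstream encoder such as the VAE mentioned in the abstract.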
Pages: 16