A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 1
Authors
Han, Xiao [1]
Chen, Fuyang [1]
Ban, Junrong [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China
Keywords
Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL;
DOI
10.1016/j.asoc.2024.112034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that incorporates Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, comprising the tension-arousal axis, the energy-arousal axis, and the valence axis, to characterize the correlation between music data and emotions more accurately. Based on this model, a music emotion classifier is designed that incorporates GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into Mel-spectrograms as input features. The music emotion selection module adopts a GAI framework based on a Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive with existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, the model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
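The matching step described in the abstract (pairing a user's emotional state with the emotional features of music in the three-dimensional emotion model) can be illustrated with a minimal sketch. This is not the paper's method: it assumes each emotion is a point with valence, tension-arousal, and energy-arousal coordinates in [-1, 1], and it ranks tracks by plain Euclidean distance. The track names, coordinates, and the `recommend` helper are hypothetical.

```python
import math
from typing import Dict, List, NamedTuple


class Emotion(NamedTuple):
    """A point in the three-dimensional emotion model (assumed range [-1, 1])."""
    valence: float
    tension_arousal: float
    energy_arousal: float


def recommend(target: Emotion, library: Dict[str, Emotion], k: int = 2) -> List[str]:
    """Return the k track names whose emotion points lie closest to the target.

    Euclidean distance is an illustrative assumption, not the paper's metric.
    """
    ranked = sorted(library.items(), key=lambda item: math.dist(item[1], target))
    return [name for name, _ in ranked[:k]]


# Hypothetical library: in practice the coordinates would come from a
# music emotion classifier, not from hand-written values.
LIBRARY = {
    "calm_piano": Emotion(0.6, -0.7, -0.5),
    "upbeat_pop": Emotion(0.8, -0.2, 0.7),
    "tense_strings": Emotion(-0.5, 0.8, 0.4),
}

# Hypothetical user state, e.g. estimated from HRV and GSR readings.
user_state = Emotion(0.7, -0.3, 0.6)
top_tracks = recommend(user_state, LIBRARY, k=2)
```

Here `top_tracks` lists the two tracks nearest to the user's state. A real system might instead weight the axes differently or match against a desired target emotion rather than the current one.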
Pages: 16
Related papers
50 in total
  • [21] Traffic Sign Recognition Algorithm Based on Multi-scale Convolution and Weighted-Hybird Loss Function
    Lan, Zhengjie
    Wang, Lie
    Su, Zhiming
    2021 INTERNATIONAL CONFERENCE ON BIG DATA ENGINEERING AND EDUCATION (BDEE 2021), 2021, : 84 - 89
  • [22] Noise suppression method based on multi-scale Dilated Convolution Network in desert seismic data
    Li, Yue
    Wang, Yuying
    Wu, Ning
    COMPUTERS & GEOSCIENCES, 2021, 156 (156)
  • [23] Person re-identification based on multi-scale feature fusion and multi-attention mechanism
    Pu, Jiacheng
    Zou, Wei
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (01) : 243 - 253
  • [24] Avionics Module Fault Diagnosis Algorithm Based on Hybrid Attention Adaptive Multi-Scale Temporal Convolution Network
    Du, Qiliang
    Sheng, Mingde
    Yu, Lubin
    Zhou, Zhenwei
    Tian, Lianfang
    He, Shilie
    ENTROPY, 2024, 26 (07)
  • [25] Attention-Based Multi-Scale Prediction Network for Time-Series Data
    Li, Junjie
    Zhu, Lin
    Zhang, Yong
    Guo, Da
    Xia, Xingwen
    CHINA COMMUNICATIONS, 2022, 19 (05) : 286 - 301
  • [26] Detection of Rice Pests Based on Self-Attention Mechanism and Multi-Scale Feature Fusion
    Hu, Yuqi
    Deng, Xiaoling
    Lan, Yubin
    Chen, Xin
    Long, Yongbing
    Liu, Cunjia
    INSECTS, 2023, 14 (03)
  • [27] Object Detection of Remote Sensing Image Based on Multi-Scale Feature Fusion and Attention Mechanism
    Du, Zuoqiang
    Liang, Yuan
    IEEE ACCESS, 2024, 12 : 8619 - 8632
  • [28] Wafer map defect recognition based on multi-scale feature fusion and attention spatial pyramid pooling
    Chen, Shouhong
    Huang, Zhentao
    Wang, Tao
    Hou, Xingna
    Ma, Jun
    JOURNAL OF INTELLIGENT MANUFACTURING, 2025, 36 (01) : 271 - 284
  • [29] A rolling bearing fault diagnosis method for imbalanced data based on multi-scale self-attention mechanism and novel loss function
    Qiang Ruiru
    Zhao Xiaoqiang
    INSIGHT, 2024, 66 (11) : 690 - 701
  • [30] Multi-Scale Residual U-Net Fundus Blood Vessel Segmentation Based on Attention Mechanism
    Zhao Feng
    Zhong Beibei
    Liu Hanqiang
    LASER & OPTOELECTRONICS PROGRESS, 2022, 59 (18)