A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 3
Authors
Han, Xiao [1]
Chen, Fuyang [1]
Ban, Junrong [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China
Keywords
Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL;
DOI
10.1016/j.asoc.2024.112034
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that incorporates Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, comprising the tension-arousal axis, energy-arousal axis, and valence axis, to explain the correlation between music data and emotions. Based on this, a music emotion classifier is designed, incorporating GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into Mel-spectrograms as input features. The music emotion selection module adopts a GAI framework based on a Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive with existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, this model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
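The matching step described in the abstract (pairing a user's emotional state with the emotional features of music in a three-dimensional emotion space) can be illustrated with a minimal sketch. This is not the authors' model: the track names, the coordinate values, the [0, 1] scaling, and the use of plain Euclidean nearest-neighbour matching are all assumptions made for illustration only.

```python
import math

# Hypothetical tracks placed in a three-dimensional emotion space:
# (tension arousal, energy arousal, valence), each assumed scaled to [0, 1].
TRACKS = {
    "calm_piano": (0.2, 0.3, 0.7),
    "upbeat_pop": (0.3, 0.9, 0.9),
    "dark_ambient": (0.8, 0.2, 0.2),
}


def recommend(user_state, tracks, k=1):
    """Return the k track names closest to user_state by Euclidean distance."""
    ranked = sorted(tracks, key=lambda name: math.dist(user_state, tracks[name]))
    return ranked[:k]


# Example: a user state estimated (by some upstream classifier, not shown)
# as low tension, high energy, and high valence.
print(recommend((0.25, 0.85, 0.8), TRACKS))
```

In a real system the user coordinates would come from a classifier over physiological signals such as HRV and GSR, and the track coordinates from the music emotion classifier; both are replaced here by hand-written values.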
Pages: 16