A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 3
Authors
Han, Xiao [1]
Chen, Fuyang [1]
Ban, Junrong [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China
Keywords
Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL;
DOI
10.1016/j.asoc.2024.112034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that combines Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, including the tension-arousal axis, energy-arousal axis, and valence axis, to explain the correlation and accuracy between music data and emotions. Based on this, a music emotion classifier is designed, incorporating GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into Mel-spectrograms as input features. The music emotion selection module adopts a GAI framework of Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive compared to existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, this model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
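The abstract names multi-scale parallel convolution and an attention mechanism but gives no architectural details. The sketch below is a minimal illustration of the general idea only, not the authors' model: several convolution branches with different kernel sizes run in parallel over the same 1-D feature sequence, and a softmax over per-branch scores weights them before fusion. The function names, the fixed averaging kernels, the kernel sizes, and the mean-absolute-activation score are all assumptions made for illustration; a trained network would learn the kernels and the attention scores.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiscale_attention(signal, kernel_sizes=(3, 5, 7)):
    """Fuse parallel convolution branches with softmax attention weights.

    Hypothetical illustration: each branch uses a fixed averaging kernel,
    and each branch's score is its mean absolute activation.
    """
    # One branch per scale, all applied to the same input in parallel.
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    # One scalar score per branch, normalized into attention weights.
    scores = [sum(abs(v) for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    # Weighted sum of the branch outputs at every position.
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = multiscale_attention([0.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0])
    print(weights)
    print(fused)
```

In a real classifier the input would be a 2-D Mel-spectrogram rather than a 1-D list, and the branches would be learned 2-D convolutions; the 1-D version keeps the example dependency-free.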
Pages: 16