A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 3
Authors
Han, Xiao [1]
Chen, Fuyang [1]
Ban, Junrong [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China
Keywords
Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL;
DOI
10.1016/j.asoc.2024.112034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that combines Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, including the tension-arousal axis, energy-arousal axis, and valence axis, to explain the correlation and accuracy between music data and emotions. Based on this, a music emotion classifier is designed, incorporating GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into Mel-spectrograms as input features. The music emotion selection module adopts a GAI framework of Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive with existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, this model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
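The matching step mentioned in the abstract (pairing a user's emotional state with the emotional features of music) can be pictured with a minimal sketch. This is not the paper's classifier: it assumes, purely for illustration, that both the user and each track are already placed as points on the three axes named above, and it ranks tracks by plain Euclidean distance. The track names, coordinates, and the `recommend` helper are all hypothetical.

```python
import math

# Hypothetical coordinates on the three axes named in the abstract:
# (tension-arousal, energy-arousal, valence), each scaled to [-1, 1].
TRACKS = {
    "calm_piano": (-0.6, -0.4, 0.5),
    "upbeat_pop": (0.1, 0.8, 0.7),
    "dark_ambient": (0.7, -0.2, -0.6),
}


def recommend(user_state, tracks, k=1):
    """Return the k track names closest to user_state in emotion space.

    Assumption: closeness is plain Euclidean distance; the paper itself
    uses a learned classifier, which this sketch does not reproduce.
    """
    ranked = sorted(tracks, key=lambda name: math.dist(user_state, tracks[name]))
    return ranked[:k]


# Example: a user state estimated (by some upstream model) as energetic
# and positive.
print(recommend((0.0, 0.7, 0.6), TRACKS, k=2))
```

In a real system the user's coordinates would come from a model over HRV and GSR signals and the track coordinates from the audio classifier; here both are hard-coded placeholders.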
Pages: 16