A GAI-based multi-scale convolution and attention mechanism model for music emotion recognition and recommendation from physiological data

Cited by: 1

Authors:

Han, Xiao [1]

Chen, Fuyang [1]

Ban, Junrong [2]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 210000, Peoples R China

[2] Nanjing Univ Aeronaut & Astronaut, Coll Aeronaut, Nanjing 210000, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2024 / Vol. 164

Keywords:

Music emotion recognition; Physiological signals; Three-dimensional emotion model; Music recommendation; Multi-scale parallel convolution; GAI-based-attention mechanism; NEURAL-NETWORK; CLASSIFICATION; EXPERIENCE; VALENCE; AROUSAL;

DOI:

10.1016/j.asoc.2024.112034

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Due to the subjectivity of emotions and the limited number of emotion categories, existing deep learning models require assistance to achieve objective, accurate, and flexible personalized music emotion recommendations. This paper introduces a deep learning approach that combines Generative Artificial Intelligence (GAI) and explicitly leverages physiological indicators to enhance the model's intelligence, versatility, and automation. Physiological indicators such as Heart Rate Variability (HRV) and Galvanic Skin Response (GSR) can be measured using sensors placed on the body's surface, providing more precise information about human emotional changes. This research employs a three-dimensional emotion model, including the tension-arousal axis, energy-arousal axis, and valence axis, to explain the correlation and accuracy between music data and emotions. Based on this, a music emotion classifier is designed, incorporating GAI algorithms to recommend music by matching users' physiological and emotional types with the emotional features of music. The classifier uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio into a Mel-spectrogram as input features. The music emotion selection module adopts a GAI framework of Variational Autoencoder (VAE) and integrates multi-scale parallel convolution and attention mechanism modules. Experimental results demonstrate that this approach is competitive compared to existing deep learning architectures on the PMEmo, RAVDESS, and Soundtrack datasets. Furthermore, due to GAI's efficient classification capability, this model is suitable for resource-constrained mobile devices and other smart devices. The results of this study can be applied to emotion-based music recommendation systems, contributing to emotional interventions and improving the performance of exercise and music therapy.
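The abstract names multi-scale parallel convolution combined with an attention mechanism but gives no architectural detail. As a rough illustration of the general idea only, and not the authors' model, the sketch below runs one 1-D signal through parallel branches with different kernel widths and fuses them with softmax weights. The function names, the averaging kernels (stand-ins for learned filters), and the energy-based branch score are all assumptions made for this example.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (k - 1 - pad)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_attention(signal, kernel_sizes=(3, 5, 7)):
    """Fuse parallel convolution branches with softmax attention weights.

    Hypothetical simplification: each branch uses a fixed averaging kernel,
    and each branch is scored by its mean energy. A trained model would
    learn both the filters and the scoring function.
    """
    # Parallel branches, one per kernel width (the "multi-scale" part).
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    # One scalar score per branch, normalised into attention weights.
    scores = [sum(v * v for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    # Attention-weighted sum of the branch outputs, position by position.
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = multi_scale_attention([0.0, 1.0, 4.0, 1.0, 0.0, 2.0, 3.0, 1.0])
    print("branch weights:", weights)
    print("fused signal:", fused)
```

In a real system the input would be a two-dimensional Mel-spectrogram rather than a short list, and the branches would be learned 2-D convolutions; the sketch only shows how branch outputs of equal length can be combined with normalised weights.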


Pages: 16

Related records: 50 in total

[1] Learning multi-scale features for speech emotion recognition with connection attention mechanism
Chen, Zengzhao
Li, Jiawen
Liu, Hai
Wang, Xuyang
Wang, Hu
Zheng, Qiuyu
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 214
[2] A novel multi-scale convolution model based on multi-dilation rates and multi-attention mechanism for mechanical fault diagnosis
Chu, Caiyuan
Ge, Yongxin
Qian, Quan
Hua, Boyu
Guo, Jie
DIGITAL SIGNAL PROCESSING, 2022, 122
[3] A multi-modal and multi-scale emotion-enhanced inference model based on fuzzy recognition
Yu, Yan
Qiu, Dong
Yan, Ruiteng
COMPLEX & INTELLIGENT SYSTEMS, 2022, 8 (02) : 1071 - 1084
[4] Emotion recognition from EEG based on multi-task learning with capsule network and attention mechanism
Li, Chang
Wang, Bin
Zhang, Silin
Liu, Yu
Song, Rencheng
Cheng, Juan
Chen, Xun
COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 143
[5] EEG emotion recognition based on the attention mechanism and pre-trained convolution capsule network
Liu, Shuaiqi
Wang, Zeyao
An, Yanling
Zhao, Jie
Zhao, Yingying
Zhang, Yu-Dong
KNOWLEDGE-BASED SYSTEMS, 2023, 265
[6] An Improved U-Net Model Based on Multi-Scale Input and Attention Mechanism: Application for Recognition of Chinese Cabbage and Weed
Ma, Zhongyang
Wang, Gang
Yao, Jurong
Huang, Dongyan
Tan, Hewen
Jia, Honglei
Zou, Zhaobo
SUSTAINABILITY, 2023, 15 (07)
[7] SwinMin: A mineral recognition model incorporating convolution and multi-scale contexts into swin transformer
Jia, Liqin
Chen, Feng
Yang, Mei
Meng, Fang
He, Mingyue
Liu, Hongmin
COMPUTERS & GEOSCIENCES, 2024, 184
[8] Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism
Li J.-Y.
Yang J.
Kong B.
Wang C.
Zhang L.
Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2021, 29 (06) : 1448 - 1458
[9] Research on cassava disease classification using the multi-scale fusion model based on EfficientNet and attention mechanism
Liu, Mingxin
Liang, Haofeng
Hou, Mingxin
FRONTIERS IN PLANT SCIENCE, 2022, 13
[10] A Dynamic Multi-Scale Convolution Model for Face Recognition Using Event-Related Potentials
Li, Shengkai
Zhang, Tonglin
Yang, Fangmei
Li, Xian
Wang, Ziyang
Zhao, Dongjie
SENSORS, 2024, 24 (13)
