Shared and Private Information Learning in Multimodal Sentiment Analysis with Deep Modal Alignment and Self-supervised Multi-Task Learning

Cited by: 3
Authors
Lai, Songning [1 ]
Li, Jiakang [5 ]
Guo, Guinan [6 ]
Hu, Xifeng [1 ]
Li, Yulong [3 ]
Tan, Yuan [5 ]
Song, Zichen [5 ]
Liu, Yutong [7 ]
Ren, Zhaoxia [4 ]
Wang, Chun [4 ]
Miao, Danmin [3 ]
Liu, Zhi [1 ,2 ]
Affiliations
[1] Shandong Univ, Sch Informat Sci & Engn, Qingdao, Peoples R China
[2] State Key Lab High Performance Server & Storage T, Jinan, Peoples R China
[3] Air Force Med Univ, Dept Mil Med Psychol, Xian, Peoples R China
[4] Shandong Univ, Assets & Lab Management Dept, Qingdao, Peoples R China
[5] Lanzhou Univ, Sch Informat Sci & Engn, Lanzhou, Peoples R China
[6] Sun Yat Sen Univ, Geog & Planning, Guangzhou, Peoples R China
[7] Natl Univ Singapore, Singapore, Singapore
Source
2024 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN 2024 | 2024
Keywords
multimodal sentiment analysis; multi-task learning; modal alignment
DOI
10.1109/IJCNN60899.2024.10651442
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Designing an effective representation learning method for multimodal sentiment analysis is a critical research area. The primary challenge is capturing shared and private information within a comprehensive modal representation, especially when dealing with uniform multimodal labels and raw feature fusion. To overcome this challenge, we propose a novel deep modal shared information learning module that utilizes the covariance matrix to capture shared information across modalities. Additionally, we introduce a label generation module based on a self-supervised learning strategy to capture the private information specific to each modality. Our module can be easily integrated into multimodal tasks and offers flexibility by allowing parameter adjustment to control the information exchange between modalities, facilitating the learning of private or shared information as needed. To further enhance performance, we employ a multi-task learning strategy that enables the model to focus on modal differentiation during training. We provide a detailed derivation and feasibility proof for the design of the deep modal shared information learning module. To evaluate our approach, we conduct extensive experiments on three common multimodal sentiment analysis benchmark datasets. The experimental results validate the reliability of our model, demonstrating its effectiveness in capturing nuanced information in multimodal sentiment analysis tasks.
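To make the covariance-based idea in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of a shared-information objective between two modality representations. It is not the authors' implementation: the names SharedInfoLoss, covariance, and the weight parameter are illustrative assumptions, and the Frobenius-norm penalty between modality covariance matrices is only one plausible way to encourage cross-modal shared statistics, combined here with a task loss in a multi-task fashion.

```python
# Hypothetical sketch (not the paper's code): a covariance-matrix alignment loss
# that encourages two modality representations to share second-order statistics.
import torch
import torch.nn as nn


def covariance(feats: torch.Tensor) -> torch.Tensor:
    """Batch feature covariance; `feats` has shape (batch, dim)."""
    centered = feats - feats.mean(dim=0, keepdim=True)
    return centered.T @ centered / (feats.size(0) - 1)


class SharedInfoLoss(nn.Module):
    """Penalizes the distance between modality covariance matrices.

    `weight` is an assumed knob standing in for the paper's adjustable
    parameter controlling how much cross-modal information exchange occurs.
    """

    def __init__(self, weight: float = 1.0):
        super().__init__()
        self.weight = weight

    def forward(self, feats_a: torch.Tensor, feats_b: torch.Tensor) -> torch.Tensor:
        cov_a = covariance(feats_a)
        cov_b = covariance(feats_b)
        # Frobenius-norm distance between the two covariance matrices.
        return self.weight * torch.norm(cov_a - cov_b, p="fro") ** 2


# Usage: add the shared-information term to a task loss in a multi-task objective.
text_feats = torch.randn(32, 128)   # placeholder text-modality features
audio_feats = torch.randn(32, 128)  # placeholder audio-modality features
shared_loss = SharedInfoLoss(weight=0.1)(text_feats, audio_feats)
```

In such a setup, a smaller weight lets each modality retain more private information, while a larger weight pushes the modalities toward shared statistics; the paper's actual parameterization may differ.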
Pages: 8