Disentangled variational auto-encoder for multimodal fusion performance analysis in multimodal sentiment analysis

Times Cited: 0
Authors
Chen, Rongfei [1 ]
Zhou, Wenju [1 ]
Hu, Huosheng [2 ]
Fei, Zixiang [3 ]
Fei, Minrui [1 ]
Zhou, Hao [4 ]
Affiliations
[1] Shanghai Univ, Sch Mechatron Engn & Automat, Shanghai Key Lab Power Stn Automat Technol, Shanghai 200444, Peoples R China
[2] Univ Essex, Sch Comp Sci & Elect Engn, Colchester CO4 3SQ, England
[3] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[4] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England
Keywords
Multimodal sentiment analysis; Model performance evaluation; Disentangled representation learning; Explainability
DOI
10.1016/j.knosys.2024.112372
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multimodal Sentiment Analysis (MSA) holds extensive applicability owing to its capacity to analyze and interpret users' emotions, feelings, and perspectives by integrating complementary information from multiple modalities. However, inefficient and unbalanced cross-modal information fusion substantially undermines the accuracy and reliability of MSA models. Consequently, a critical challenge in the field lies in effectively assessing the information integration capabilities of these models to ensure balanced and equitable processing of multimodal data. In this paper, a Disentanglement-based Variational Auto-Encoder (DVAE) is proposed for systematically assessing fusion performance and investigating the factors that facilitate multimodal fusion. Specifically, a distribution constraint module is presented to decouple the fusion matrices and generate multiple low-dimensional, trustworthy disentangled latent vectors that adhere to the authentic unimodal input distributions. In addition, a combined loss term is designed to effectively balance the inductive bias, signal reconstruction, and distribution constraint terms, facilitating the optimization of the network weights and parameters. With the proposed evaluation method, the fusion performance of multimodal models can be assessed by contrasting the classification degradation ratios derived from the disentangled latent representations and the joint representations. Experiments conducted with eight state-of-the-art multimodal fusion methods on the CMU-MOSI and CMU-MOSEI benchmark datasets demonstrate that DVAE is capable of effectively evaluating the effects of multimodal fusion. Moreover, the comparative experimental results indicate that both the equalizing effect among various advanced mechanisms in multimodal sentiment analysis and the single-peak characteristic of the ground-truth label distribution contribute significantly to multimodal data fusion.
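The abstract names three concrete ingredients that a code sketch can make tangible: per-modality latent vectors constrained toward the unimodal input distributions, a combined loss that balances signal reconstruction against that distribution constraint, and a classification degradation ratio contrasting joint and disentangled representations. Below is a minimal PyTorch sketch of these ideas, assuming simple linear encoder heads and a standard Gaussian prior; the names (FusionProbe, combined_loss, degradation_ratio), the beta weighting, and the architecture are illustrative assumptions, not the authors' DVAE implementation.

```python
# Hypothetical sketch of a VAE-style fusion probe; not the paper's DVAE.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusionProbe(nn.Module):
    """Encode a joint (fused) representation into per-modality latent vectors."""

    def __init__(self, fused_dim: int, latent_dim: int, n_modalities: int = 3):
        super().__init__()
        # One encoder head per modality; each outputs a mean and a log-variance.
        self.heads = nn.ModuleList(
            [nn.Linear(fused_dim, 2 * latent_dim) for _ in range(n_modalities)]
        )
        # A decoder reconstructs the fused representation from all latents.
        self.decoder = nn.Linear(n_modalities * latent_dim, fused_dim)

    def forward(self, fused):
        mus, logvars, zs = [], [], []
        for head in self.heads:
            mu, logvar = head(fused).chunk(2, dim=-1)
            # Reparameterization trick: z = mu + sigma * epsilon.
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            mus.append(mu)
            logvars.append(logvar)
            zs.append(z)
        recon = self.decoder(torch.cat(zs, dim=-1))
        return recon, zs, mus, logvars


def combined_loss(recon, fused, mus, logvars, beta=1.0):
    """Signal-reconstruction term plus a distribution-constraint term.

    The KL divergence to a standard normal prior stands in for the paper's
    constraint toward the authentic unimodal input distributions; `beta`
    plays the balancing role the abstract attributes to the combined loss.
    """
    rec = F.mse_loss(recon, fused)
    kl = sum(
        -0.5 * torch.mean(1 + lv - mu.pow(2) - lv.exp())
        for mu, lv in zip(mus, logvars)
    )
    return rec + beta * kl


def degradation_ratio(acc_joint: float, acc_disentangled: float) -> float:
    """Relative accuracy drop when classifying from one disentangled latent
    instead of the joint representation."""
    return (acc_joint - acc_disentangled) / acc_joint
```

In use, one would freeze the fusion model under evaluation, train this probe on its joint representations, then compare a downstream classifier's accuracy on the joint representation against its accuracy on each recovered per-modality latent; roughly equal degradation ratios across modalities would suggest balanced fusion.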
Pages: 19
Related Papers
50 records in total
  • [31] Language Reinforced Superposition Multimodal Fusion for Sentiment Analysis
    He, Jiaxuan
    Hu, Haifeng
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29: 1347-1351
  • [32] Review of Multimodal Sensor Data Fusion in Sentiment Analysis
    Jin, Yelei
    Gulanbaier, Tuerhong
    Mairidan, Wushouer
    COMPUTER ENGINEERING AND APPLICATIONS, 2023, 59(23): 1-14
  • [33] Greedy Fusion Oriented Representations for Multimodal Sentiment Analysis
    Xu, Ran
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035: 569-581
  • [34] Fusion-Extraction Network for Multimodal Sentiment Analysis
    Jiang, Tao
    Wang, Jiahai
    Liu, Zhiyue
    Ling, Yingbiao
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II, 2020, 12085: 785-797
  • [35] WConF: Weighted Contrastive Fusion for Multimodal Sentiment Analysis
    Zeng, Biqing
    Li, Ruiyuan
    Lu, Liuxing
    Lu, Liangqi
    Wang, Jiazhen
    Chen, Weihai
    Dong, Huimin
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024, 2025, 15363: 30-42
  • [36] A Review of Multimodal Sentiment Analysis: Modal Fusion and Representation
    Fu, Hanxue
    Lu, Huimin
    20TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC 2024, 2024: 49-54
  • [37] Multimodal Sentiment Analysis Based on Composite Hierarchical Fusion
    Lei, Yu
    Qu, Keshuai
    Zhao, Yifan
    Han, Qing
    Wang, Xuguang
    COMPUTER JOURNAL, 2024, 67(06): 2230-2245
  • [38] Survey of Sentiment Analysis Algorithms Based on Multimodal Fusion
    Guo, Xu
    Mairidan, Wushouer
    Gulanbaier, Tuerhong
    COMPUTER ENGINEERING AND APPLICATIONS, 2024, 60(02): 1-18
  • [39] Variational Auto-Encoder for text generation
    Hu, Haojin
    Liao, Mengfan
    Mao, Weiming
    Liu, Wei
    Zhang, Chao
    Jing, Yanmei
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020: 595-598
  • [40] A Sentimental Prompt Framework with Visual Text Encoder for Multimodal Sentiment Analysis
    Huang, Shizhou
    Xu, Bo
    Li, Changqun
    Ye, Jiabo
    Lin, Xin
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024: 638-646