Evaluation of Data Inconsistency for Multi-modal Sentiment Analysis

Cited by: 0
Authors
Wang, Yufei [1 ]
Wu, Mengyue [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200000, Peoples R China
Source
MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024 | 2025, Vol. 2312
Keywords
Multi-modal Sentiment Analysis; Multi-modal Large Language Model; Data Inconsistency;
DOI
10.1007/978-981-96-1045-7_25
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Emotion semantic inconsistency is a ubiquitous challenge in multi-modal sentiment analysis (MSA), which analyzes sentiment expressed across modalities such as text, audio, and video. Because human emotional expression is subtle and nuanced, each modality may convey distinct aspects of sentiment, producing inconsistencies that can hinder the predictions of artificial agents. In this work, we introduce a modality-conflicting test set and assess the performance of both traditional MSA models and multi-modal large language models (MLLMs). Our findings reveal significant performance degradation in traditional models when confronted with semantically conflicting data, and point out the drawbacks of MLLMs in handling multi-modal emotion analysis. Our research presents a new challenge and offers valuable insights for the future development of sentiment analysis systems.
Pages: 299-310
Page count: 12
Related Papers
50 records total
  • [31] Chen, Rongfei; Zhou, Wenju; Li, Yang; Zhou, Huiyu. Video-Based Cross-Modal Auxiliary Network for Multimodal Sentiment Analysis. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12): 8703-8716.
  • [32] Wang, Xiaofan; Li, Xiuhong; Li, Zhe; Zhou, Chenyu; Chen, Fan; Yang, Dan. Enhancing Cross-Modal Alignment in Multimodal Sentiment Analysis via Prompt Learning. PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035: 541-554.
  • [33] Xiao, Luwei; Wu, Xingjiao; Wu, Wen; Yang, Jing; He, Liang. Multi-Channel Attentive Graph Convolutional Network with Sentiment Fusion for Multimodal Sentiment Analysis. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 4578-4582.
  • [34] Deng, Yang; Li, Yonghong; Xian, Sidong; Li, Laquan; Qiu, Haiyang. Mual: Enhancing Multimodal Sentiment Analysis with Cross-Modal Attention and Difference Loss. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (03).
  • [35] Tang, Zemin; Xiao, Qi; Qin, Yunchuan; Zhou, Xu; Zhou, Joey Tianyi; Li, Kenli. Multi-View Interactive Representations for Multimodal Sentiment Analysis. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01): 4095-4107.
  • [36] Lin, Ronghao; Hu, Haifeng. Multi-Task Momentum Distillation for Multimodal Sentiment Analysis. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (02): 549-565.
  • [37] Hu, Zixu; Yu, Zhengtao; Guo, Junjun. Multi-Level Sentiment-Aware Clustering for Denoising in Multimodal Sentiment Analysis with ASR Errors. MULTIMEDIA SYSTEMS, 2025, 31 (02).
  • [38] Huang, Ju; Lu, Pengtao; Sun, Shuifa; Wang, Fangyi. Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network. ELECTRONICS, 2023, 12 (16).
  • [39] Yang, Kaicheng; Xu, Hua; Gao, Kai. CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 521-528.
  • [40] Zhang, Yuchen; Thong, Hong; Chen, Guilin; Alhusaini, Naji; Zhao, Shenghui; Wu, Cheng. Multimodal Sentiment Analysis Network Based on Distributional Transformation and Gated Cross-Modal Fusion. 2024 INTERNATIONAL CONFERENCE ON NETWORKING AND NETWORK APPLICATIONS, NANA 2024, 2024: 496-503.