CubeMLP: An MLP-based Model for Multimodal Sentiment Analysis and Depression Estimation

Cited by: 55
Authors
Sun, Hao [1 ]
Wang, Hongyi [1 ]
Liu, Jiaqing [2 ]
Chen, Yen-Wei [2 ]
Lin, Lanfen [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, Kusatsu, Shiga, Japan
Source
Proceedings of the 30th ACM International Conference on Multimedia, MM 2022 | 2022
Keywords
multimodal processing; multimodal fusion; multimodal interaction; multimedia; MLP; sentiment analysis; depression detection;
DOI
10.1145/3503161.3548025
Chinese Library Classification (CLC) Number
TP39 [Computer Applications];
Discipline Classification Code
081203; 0835;
Abstract
Multimodal sentiment analysis and depression estimation are two important research topics that aim to predict human mental states using multimodal data. Previous research has focused on developing effective fusion strategies for exchanging and integrating mind-related information from different modalities. Some MLP-based techniques have recently achieved considerable success in a variety of computer vision tasks. Inspired by this, we explore multimodal approaches from a feature-mixing perspective in this study. To this end, we introduce CubeMLP, a multimodal feature processing framework based entirely on MLPs. CubeMLP consists of three independent MLP units, each of which has two affine transformations. CubeMLP accepts all relevant modality features as input and mixes them across three axes. After processing with CubeMLP, the mixed multimodal features are flattened for task prediction. We conduct experiments on two sentiment analysis datasets, CMU-MOSI and CMU-MOSEI, and on the depression estimation dataset AVEC2019. The results show that CubeMLP can achieve state-of-the-art performance at a much lower computational cost.
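The abstract only describes the architecture at a high level: three MLP units, each with two affine transformations, that mix multimodal features across three axes before the result is flattened for prediction. The PyTorch snippet below is a minimal illustrative sketch of that idea, not the authors' released implementation; the tensor layout (batch, sequence, modality, channel), the hidden width, the residual connections, and the LayerNorm placement are all assumptions made for this example.

```python
# Illustrative sketch of a CubeMLP-style block (assumptions, not the paper's code).
# Assumed input layout: (batch, sequence_length, num_modalities, channels);
# "mixing across three axes" is realized by a small two-layer MLP applied along
# the sequence, modality, and channel axes in turn.
import torch
import torch.nn as nn


class AxisMLP(nn.Module):
    """Two affine transformations with a nonlinearity, applied along one axis."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)  # first affine transformation
        self.fc2 = nn.Linear(hidden, dim)  # second affine transformation
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(self.act(self.fc1(x)))


class CubeMLPBlock(nn.Module):
    """Mixes features across the sequence, modality, and channel axes."""

    def __init__(self, seq_len: int, num_modalities: int, channels: int, hidden: int = 64):
        super().__init__()
        self.seq_mlp = AxisMLP(seq_len, hidden)
        self.mod_mlp = AxisMLP(num_modalities, hidden)
        self.chan_mlp = AxisMLP(channels, hidden)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, num_modalities, channels)
        x = x + self.seq_mlp(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)  # mix along sequence axis
        x = x + self.mod_mlp(x.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)  # mix along modality axis
        x = x + self.chan_mlp(self.norm(x))                              # mix along channel axis
        return x


class CubeMLPPredictor(nn.Module):
    """Stacks CubeMLP-style blocks, then flattens the mixed features for the task head."""

    def __init__(self, seq_len, num_modalities, channels, num_blocks=2, out_dim=1):
        super().__init__()
        self.blocks = nn.Sequential(
            *[CubeMLPBlock(seq_len, num_modalities, channels) for _ in range(num_blocks)]
        )
        self.head = nn.Linear(seq_len * num_modalities * channels, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.blocks(x)
        return self.head(x.flatten(start_dim=1))  # flatten mixed features for prediction


if __name__ == "__main__":
    # Toy example: 8 samples, 20 time steps, 3 modalities (e.g. text/audio/visual), 32 channels.
    features = torch.randn(8, 20, 3, 32)
    model = CubeMLPPredictor(seq_len=20, num_modalities=3, channels=32)
    print(model(features).shape)  # torch.Size([8, 1])
```

Under these assumptions, each axis is mixed by an independent two-layer MLP, which is what keeps the parameter count and computation low compared with attention-based fusion.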
Pages: 3722-3729
Page count: 8
Related Papers
50 records in total
  • [31] Hu Y.; Huang X.; Wang X.; Lin H.; Zhang R. Transformer-based adaptive contrastive learning for multimodal sentiment analysis. Multimedia Tools and Applications, 2025, 84(3): 1385-1402.
  • [32] Lou, Yinxia; Zhou, Junxiang; Zhou, Jun; Ji, Donghong; Zhang, Qing. Emoji multimodal microblog sentiment analysis based on mutual attention mechanism. Scientific Reports, 2024, 14(1).
  • [33] Liu, Zijun; Cai, Li; Yang, Wenjie; Liu, Junhui. Sentiment analysis based on text information enhancement and multimodal feature fusion. Pattern Recognition, 2024, 156.
  • [34] Lyu X.; Tian C.; Zhang L.; Du Y.; Zhang X.; Cai Z. Multimodal Sentiment Analysis Model Integrating Multi-features and Attention Mechanism. Data Analysis and Knowledge Discovery, 2024, 8(5): 91-101.
  • [35] Han, Xue; Cheng, Honlin; Ding, Jike; Yan, Suqin. Semisupervised Hierarchical Subspace Learning Model for Multimodal Social Media Sentiment Analysis. IEEE Transactions on Consumer Electronics, 2024, 70(1): 3446-3454.
  • [36] Shao, Xi; Tang, Guijin; Bao, Bing-Kun. Personalized Travel Recommendation Based on Sentiment-Aware Multimodal Topic Model. IEEE Access, 2019, 7: 113043-113052.
  • [37] Huang He. Sentiment Analysis of Sina Weibo Based on Semantic Sentiment Space Model. 2013 International Conference on Management Science and Engineering (ICMSE), 2013: 206-211.
  • [38] Cao, Donglin; Ji, Rongrong; Lin, Dazhen; Li, Shaozi. Visual sentiment topic model based microblog image sentiment analysis. Multimedia Tools and Applications, 2016, 75(15): 8955-8968.
  • [40] Pei, Shaojie; Zhang, Lumin; Li, Aiping. Microblog Sentiment Analysis Model Based on Emoticons. Web Technologies and Applications, APWEB 2014, Part II, 2014, 8710: 127-135.