Modeling inter-modal incongruous sentiment expressions for multi-modal sarcasm detection

Cited by: 3
Authors
Ou, Lisong [1 ,2 ,3 ]
Li, Zhixin [1 ,2 ]
Affiliations
[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Guilin Univ Technol, Sch Math & Stat, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal sarcasm detection; Graph convolutional network; Cross-modal mapping; External knowledge; Cross-correlation graphs;
DOI
10.1016/j.neucom.2024.128874
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-modal sarcasm detection (MSD) is a challenging task. Despite the progress made by existing models, two principal hurdles persist. First, prevailing methods address only superficial disparities between textual inputs and their associated images, neglecting nuanced inter-modal combinations. Second, sarcastic instances frequently involve intricate emotional expressions, making it essential to leverage emotional cues across modalities to discern sarcastic intent. Accordingly, this research proposes a deep graph convolutional network that integrates cross-modal mapping information to identify significant incongruent sentiment expressions across modalities for multi-modal sarcasm detection. Specifically, we first design a cross-modal mapping network that obtains interaction information between the two modalities by mapping text feature vectors and image feature vectors pairwise, compensating for the lack of multi-modal data in the fusion process. In addition, we employ external knowledge in the form of ANPs (adjective-noun pairs) as a bridge to construct cross-correlation graphs from highly correlated sarcastic cues and their connection weights between the image and text modalities. A GCN architecture with a retrieval-based attention mechanism then captures the sarcastic cues. Experiments on two publicly available datasets demonstrate a significant performance improvement of our method over numerous contemporary models.
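The abstract describes propagating information over a cross-correlation graph (nodes: text tokens and image-derived ANP cues; edges: connection weights) with a graph convolutional network. As a minimal, illustrative sketch of the single graph-convolution step only — not the authors' implementation; `gcn_layer`, the symmetric normalization, and all names here are assumptions:

```python
import math

def matmul(A, B):
    """Plain list-of-lists matrix product (no external dependencies)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: H' = ReLU(A_norm @ H @ W).

    adj    -- weighted adjacency of the cross-correlation graph (n x n)
    feats  -- node feature matrix H (n x d), e.g. token/ANP embeddings
    weight -- learnable projection W (d x d')
    A_norm is the symmetrically normalized adjacency with self-loops,
    the standard Kipf-Welling formulation assumed here.
    """
    n = len(adj)
    # add self-loops so each node keeps its own features
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    a_norm = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]
    h = matmul(matmul(a_norm, feats), weight)
    # ReLU non-linearity
    return [[max(0.0, v) for v in row] for row in h]
```

In the paper's setting, `adj` would carry the ANP-mediated correlation weights between image and text nodes, and a retrieval-based attention mechanism (not sketched here) would reweight the resulting node representations before classification.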
Pages: 11
Related papers
29 records in total
  • [21] Knowledge-Based Visual Question Answering Using Multi-Modal Semantic Graph
    Jiang, Lei
    Meng, Zuqiang
    ELECTRONICS, 2023, 12 (06)
  • [22] Sparse Interpretation of Graph Convolutional Networks for Multi-modal Diagnosis of Alzheimer's Disease
    Zhou, Houliang
    Zhang, Yu
    Chen, Brian Y.
    Shen, Li
    He, Lifang
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VIII, 2022, 13438 : 469 - 478
  • [23] Multi-Modal Diagnosis of Alzheimer's Disease Using Interpretable Graph Convolutional Networks
    Zhou, Houliang
    He, Lifang
    Chen, Brian Y.
    Shen, Li
    Zhang, Yu
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2025, 44 (01) : 142 - 153
  • [24] An effective multi-modal adaptive contextual feature information fusion method for Chinese long text classification
    Xu, Yangshuyi
    Liu, Guangzhong
    Zhang, Lin
    Shen, Xiang
    Luo, Sizhe
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (09)
  • [25] Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks
    Yoon, Jun-Ho
    Buu, Seok-Jun
    Kim, Hae-Jung
    ELECTRONICS, 2024, 13 (16)
  • [26] Fuel consumption prediction for pre-departure flights using attention-based multi-modal fusion
    Lin, Yi
    Guo, Dongyue
    Wu, Yuankai
    Li, Lishuai
    Wu, Edmond Q.
    Ge, Wenyi
    INFORMATION FUSION, 2024, 101
  • [28] Video-text retrieval via multi-modal masked transformer and adaptive attribute-aware graph convolutional network
    Lv, Gang
    Sun, Yining
    Nian, Fudong
    MULTIMEDIA SYSTEMS, 2024, 30 (01)
  • [29] Attention-Based Node-Edge Graph Convolutional Networks for Identification of Autism Spectrum Disorder Using Multi-Modal MRI Data
    Chen, Yuzhong
    Yan, Jiadong
    Jiang, Mingxin
    Zhao, Zhongbo
    Zhao, Weihua
    Zhang, Rong
    Kendrick, Keith M.
    Jiang, Xi
    PATTERN RECOGNITION AND COMPUTER VISION, PT III, 2021, 13021 : 374 - 385