Multi-Modal Sarcasm Detection Based on Cross-Modal Composition of Inscribed Entity Relations

Cited by: 7
Authors
Li, Lingshan [1]
Jin, Di [1]
Wang, Xiaobao [1]
Guo, Fengyu [1]
Wang, Longbiao [1]
Dang, Jianwu [1]
Affiliation
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
Source
2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI | 2023
Keywords
Multi-modal sarcasm detection; Graph neural network; Sarcasm detection;
DOI
10.1109/ICTAI59109.2023.00138
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sarcasm, a linguistic technique employed to express emotions opposite to their literal meaning, has garnered significant attention from researchers due to the rise of social media. Detecting sarcasm in a multi-modal context has become a focal point in recent studies. However, existing research primarily relies on identifying inconsistencies between text semantics and image semantics, often lacking a deep understanding of images. Consequently, capturing inconsistencies between images and texts poses a challenge in many cases. In this paper, we propose the Entity-Relational Graph Convolutional Network (ERGCN) as a solution to detect sarcasm by examining the relationship between entities within images. Our approach involves extracting entities and text descriptions from each image, which provides valuable entity information. Subsequently, we employ external knowledge to construct a cross-modal graph for each text and image pair, emphasizing the presence of internal contradictory information. Finally, we utilize the graph convolutional network to identify inconsistent information across modalities and successfully detect sarcasm. Experimental results demonstrate that our model achieves state-of-the-art performance on a widely used multi-modal Twitter dataset.
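The graph step described in the abstract can be illustrated with a minimal sketch. This is not the authors' ERGCN implementation: the node labels, edge weights, feature sizes, and the helper names `normalize_adjacency`, `matmul`, and `gcn_layer` are all hypothetical, and the propagation rule is the standard symmetric-normalized graph convolution, which the abstract does not confirm as the exact variant used. The sketch only shows how text tokens and image entities can share one graph so that a convolution layer mixes information across modalities.

```python
import math
import random

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(a_norm, features, weight):
    """One graph convolution step: ReLU(A_norm @ X @ W)."""
    z = matmul(matmul(a_norm, features), weight)
    return [[max(v, 0.0) for v in row] for row in z]

# Hypothetical toy graph: nodes 0-2 are text tokens, nodes 3-4 are image entities.
nodes = ["text:love", "text:rainy", "text:monday", "img:umbrella", "img:puddle"]
n = len(nodes)
adj = [[0.0] * n for _ in range(n)]

# (i, j, weight): the cross-modal weights are made-up placeholders standing in
# for scores that would come from an external knowledge source.
edges = [(0, 1, 1.0), (1, 2, 1.0), (3, 4, 1.0), (0, 4, 0.8), (1, 3, 0.6)]
for i, j, w in edges:
    adj[i][j] = w
    adj[j][i] = w  # undirected graph

rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(n)]
weight = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weight)

# Mean-pool node states into one graph-level vector for a downstream classifier.
graph_repr = [sum(col) / n for col in zip(*hidden)]
```

In a full model the random features would be replaced by encoder outputs for the text and the image entities, and `graph_repr` would feed a binary sarcasm classifier. Both of those choices are assumptions here, not details taken from the record.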
Pages: 918-925
Page count: 8