Improving Fake News Detection by Using an Entity-enhanced Framework to Fuse Diverse Multimodal Clues

Cited by: 72
Authors
Qi, Peng [1, 2, 3]
Cao, Juan [1, 2]
Li, Xirong [4]
Liu, Huan [5]
Sheng, Qiang [1, 2]
Mi, Xiaoyue [1, 2]
He, Qin [6]
Lv, Yongbiao [6]
Guo, Chenyang [6]
Yu, Yingchao [6]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Inst Artificial Intelligence, Hebi, Peoples R China
[4] Renmin Univ China, Key Lab DEKE, Beijing, Peoples R China
[5] Zhengzhou Univ, Zhengzhou, Peoples R China
[6] Hangzhou ZhongkeRuijian Technol Co Ltd, Hangzhou, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Natural Science Foundation of China
Keywords
fake news detection; multimodal fusion; visual entity; social media;
DOI
10.1145/3474085.3481548
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, fake news with text and images has achieved more effective diffusion than text-only fake news, raising a severe issue of multimodal fake news detection. Current studies on this issue have made significant contributions to developing multimodal models, but they fall short of modeling the multimodal content sufficiently. Most of them only preliminarily model the basic semantics of the images as a supplement to the text, which limits their detection performance. In this paper, we find three valuable text-image correlations in multimodal fake news: entity inconsistency, mutual enhancement, and text complementation. To effectively capture these multimodal clues, we extract visual entities (such as celebrities and landmarks) to understand the news-related high-level semantics of images, and then model the multimodal entity inconsistency and mutual enhancement with the help of visual entities. Moreover, we extract the embedded text in images as a complement to the original text. Building on these components, we propose a novel entity-enhanced multimodal fusion framework, which simultaneously models three cross-modal correlations to detect diverse multimodal fake news. Extensive experiments demonstrate the superiority of our model compared to the state of the art.
Pages: 1212-1220
Page count: 9