A Hybrid Linguistic and Knowledge-Based Analysis Approach for Fake News Detection on Social Media

Cited by: 52
Authors
Seddari, Noureddine [1, 2]
Derhab, Abdelouahid [3]
Belaoued, Mohamed [1, 4]
Halboob, Waleed [1, 3]
Al-Muhtadi, Jalal [3, 5]
Bouras, Abdelghani [6]
Affiliations
[1] Univ 20 Aout 1955 Skikda, Dept Comp Sci, ILICUS Lab, Skikda 21000, Algeria
[2] Univ Abdelhamid Mehri Constantine 2, LIRE Lab, Constantine 25000, Algeria
[3] King Saud Univ, Ctr Excellence Informat Assurance CoEIA, Riyadh 11653, Saudi Arabia
[4] Univ Reims, CReSTIC, F-51100 Reims, France
[5] King Saud Univ, Coll Comp & Informat Sci, Riyadh 11653, Saudi Arabia
[6] Alfaisal Univ, Dept Ind Engn, Coll Engn, Riyadh 11533, Saudi Arabia
Keywords
Fake news; Feature extraction; Linguistics; Knowledge based systems; Syntactics; Social networking (online); Semantics; Social media; fake news detection; linguistic analysis; knowledge analysis; fact-checking website; TOKEN RATIO; INFORMATION; CUES
DOI
10.1109/ACCESS.2022.3181184
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The rapid development of social media and content-sharing platforms has been widely exploited to spread misinformation and fake news, which lead people to believe harmful stories, can influence public opinion, and could cause panic and chaos among the population. Thus, fake news detection has become an important research topic, aiming to flag specific content as fake or legitimate. Fake news detection solutions can be divided into three main categories: content-based, social context-based, and knowledge-based approaches. In this paper, we propose a novel hybrid fake news detection system that combines linguistic and knowledge-based approaches and inherits their advantages by employing two different sets of features: (1) linguistic features (i.e., title, number of words, reading ease, lexical diversity, and sentiment), and (2) a novel set of knowledge-based features, called fact-verification features, that comprise three types of information, namely (i) reputation of the website where the news is published, (ii) coverage, i.e., the number of sources that published the news, and (iii) fact-check, i.e., the opinion of well-known fact-checking websites about the news (true or false). The proposed system employs only eight features, fewer than most state-of-the-art approaches. Also, evaluation results on a fake news dataset show that the proposed system employing both types of features reaches an accuracy of 94.4%, which is higher than that obtained by employing linguistic features alone (accuracy = 89.4%) or fact-verification features alone (accuracy = 81.2%).
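As a rough illustration of the linguistic feature set named in the abstract, the sketch below computes three of the listed features (number of words, lexical diversity, and reading ease) for a piece of text. It is not the authors' implementation. The abstract does not say how each feature is measured, so the type-token ratio for lexical diversity, the Flesch Reading Ease formula for reading ease, the vowel-group syllable heuristic, and the function names are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Return lowercase word tokens (letters, with an optional inner apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def count_syllables(word):
    """Crude heuristic (assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def linguistic_features(text):
    """Compute three illustrative linguistic features for a news text."""
    words = tokenize(text)
    # Treat runs of ., ! or ? as sentence boundaries; ignore empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    n_sentences = max(1, len(sentences))
    if n_words == 0:
        return {"num_words": 0, "lexical_diversity": 0.0, "reading_ease": 0.0}

    syllables = sum(count_syllables(w) for w in words)
    # Lexical diversity as type-token ratio: distinct words / total words.
    type_token_ratio = len(set(words)) / n_words
    # Flesch Reading Ease: higher scores indicate easier text.
    reading_ease = (
        206.835
        - 1.015 * (n_words / n_sentences)
        - 84.6 * (syllables / n_words)
    )
    return {
        "num_words": n_words,
        "lexical_diversity": type_token_ratio,
        "reading_ease": reading_ease,
    }


if __name__ == "__main__":
    print(linguistic_features("The cat sat. The cat ran."))
```

In a system like the one described, values such as these would be joined with the title and sentiment features and the three fact-verification features (reputation, coverage, fact-check) to form the eight-feature input to a classifier. How those remaining features are obtained is not specified in the abstract and is left out here.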
Pages: 62097-62109
Number of pages: 13