Automating fake news detection system using multi-level voting model

Cited by: 78
Authors
Kaur, Sawinder [1]
Kumar, Parteek [2]
Kumaraguru, Ponnurangam [3]
Affiliations
[1] TIET, Comp Sci & Engn Dept, Doctoral Res Lab 2, Patiala, Punjab, India
[2] TIET, Comp Sci & Engn Dept, Patiala, Punjab, India
[3] IIIT, Comp Sci & Engn Dept, Delhi, India
Keywords
Fake news articles; Count-Vectorizer; TF-IDF; Hashing-Vectorizer; Classifiers; Textual content; Machine learning models;
DOI
10.1007/s00500-019-04436-y
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Online fake news has become increasingly prominent in shaping how news stories spread online. Misleading or unreliable information in the form of videos, posts, articles and URLs is widely disseminated through popular social media platforms such as Facebook and Twitter. As a result, editors and journalists need new tools that can help them speed up the verification of content originating from social media. Motivated by the need for automated detection of fake news, the goal is to find out which classification model identifies phony features accurately using three feature extraction techniques: Term Frequency-Inverse Document Frequency (TF-IDF), Count-Vectorizer (CV) and Hashing-Vectorizer (HV). The paper also proposes a novel multi-level voting ensemble model. The proposed system has been tested on three datasets using twelve classifiers. These ML classifiers are combined based on their false prediction ratio. Passive Aggressive, Logistic Regression and Linear Support Vector Classifier (LinearSVC) individually perform best using the TF-IDF, CV and HV feature extraction approaches, respectively, based on their performance metrics, whereas the proposed model outperforms the Passive Aggressive model by 0.8%, the Logistic Regression model by 1.3% and the LinearSVC model by 0.4% using TF-IDF, CV and HV, respectively. The proposed system can also be used to predict fake textual content from online social media websites.
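The idea of voting at more than one level can be illustrated with a minimal sketch. It assumes plain hard (majority) voting inside hypothetical groups of classifiers, followed by a second vote across the group winners. The function names, the grouping and the number of levels are illustrative assumptions only; the abstract states that classifiers are combined based on their false prediction ratio but does not give the exact procedure.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def multi_level_vote(groups):
    """Hard-vote inside each group (level 1), then across group winners (level 2)."""
    level_one = [majority_vote(group) for group in groups]
    return majority_vote(level_one)


# Hypothetical per-classifier labels for one article (1 = fake, 0 = real).
# Each inner list stands for one assumed group of classifiers.
groups = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
result = multi_level_vote(groups)
print(result)
```

In this sketch the labels would come from already trained classifiers applied to TF-IDF, CV or HV features; how the groups are formed is left open here.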
Pages: 9049-9069
Page count: 21