Linguistic features based framework for automatic fake news detection

Cited by: 29
Authors
Garg, Sonal [1]
Sharma, Dilip Kumar [1]
Affiliations
[1] GLA Univ, Mathura, India
Keywords
Artificial Intelligence; Linguistic features; Machine-learning; Statistical Measure; Text classification; DECEPTION; CUES
DOI
10.1016/j.cie.2022.108432
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Nowadays, social media platforms are a main channel for news consumption. Political groups use these platforms to attract users and win their votes. Given the large volume of data on social media, it is essential to verify the authenticity of the content. Combating misinformation requires artificial intelligence techniques, including the development of embeddings and the deployment of machine-learning algorithms. This paper focuses on several categories of linguistic features, covering complexity features, readability indices, psycholinguistic features, and stylometric features, for effective fake news identification. The linguistic model computes language-driven features by learning the properties of news content. In this work, we selected twenty-six significant features and applied various machine learning models. For feature extraction, three techniques are applied: term frequency-inverse document frequency (tf-idf), count vectorizer (CV), and hash-vectorizer (HV). We then tested the models with different training dataset sizes to obtain the accuracy of each model and compared them. We used four existing datasets for the experiments. The proposed framework achieved 90.8% accuracy on the Reuter dataset and a highest accuracy of 90% on the Buzzfeed dataset. On the Random Political and Mc_Intire datasets, it achieved accuracies of 93.8% and 86.9%, respectively.
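The three feature-extraction techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which library or settings were used. The names resemble scikit-learn's `CountVectorizer`, `TfidfVectorizer`, and `HashingVectorizer`, but the code below uses only the Python standard library. The tokenizer, the unsmoothed idf formula, the CRC32 hash, the bucket count, and the three example headlines are all assumptions chosen for illustration.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def count_vectors(docs):
    """Count vectorizer: one column per vocabulary term, holding raw term counts."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {term: i for i, term in enumerate(vocab)}
    rows = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, n in Counter(tokenize(doc)).items():
            row[index[tok]] = n
        rows.append(row)
    return vocab, rows


def tfidf_vectors(docs):
    """tf-idf: count * (ln(N / df) + 1), an unsmoothed textbook variant (assumed)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) + 1.0 for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


def hashed_vectors(docs, n_buckets=16):
    """Hash vectorizer: no vocabulary; a token lands in bucket crc32(token) % n_buckets."""
    rows = []
    for doc in docs:
        row = [0] * n_buckets
        for tok in tokenize(doc):
            row[zlib.crc32(tok.encode("utf-8")) % n_buckets] += 1
        rows.append(row)
    return rows


# Hypothetical headlines, invented for this example only.
docs = [
    "shocking secret cure doctors hate",
    "senate passes budget bill",
    "shocking budget secret revealed",
]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
hashed = hashed_vectors(docs)
```

The count and tf-idf variants need a vocabulary built from the training documents, whereas the hashed variant has a fixed width and needs no fitting, at the cost of possible collisions between tokens that share a bucket.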
Pages: 12