Linguistic features based framework for automatic fake news detection

Cited by: 39

Authors:

Garg, Sonal [1]

Sharma, Dilip Kumar [1]

Affiliations:

[1] GLA Univ, Mathura, India

Source:

COMPUTERS & INDUSTRIAL ENGINEERING | 2022 / Vol. 172

Keywords:

Artificial Intelligence; Linguistic features; Machine-learning; Statistical Measure; Text classification; DECEPTION; CUES;

DOI:

10.1016/j.cie.2022.108432

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Social media platforms are nowadays used mainly for news consumption. Political groups use these platforms to attract users and sway their votes. Given the large volume of data on social media, it is essential to verify the authenticity of content. Combating misinformation requires artificial intelligence techniques, including the development of embeddings and the deployment of machine-learning algorithms. This paper focuses on several categories of linguistic features (complexity features, readability indices, psycholinguistic features, and stylometric features) for effective fake news identification. The linguistic model computes language-driven features by learning the properties of news content. In this work, we selected twenty-six significant features and applied various machine learning models. For feature extraction, three techniques are applied: term frequency-inverse document frequency (tf-idf), count vectorizer (CV), and hash vectorizer (HV). We then tested the models with different training dataset sizes and compared the accuracy of each model. We used four existing datasets for the experiments. The proposed framework achieved 90.8% accuracy on the Reuter dataset and a highest accuracy of 90% on the Buzzfeed dataset. On the Random Political and Mc_Intire datasets, it achieved accuracies of 93.8% and 86.9%, respectively.
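As an illustration of the three feature-extraction techniques named in the abstract, the following is a minimal standard-library sketch, not the authors' implementation. The tokenizer, the unsmoothed idf formula log(N / df), the CRC32 hash, the bucket count, and the two sample sentences are all assumptions made for this example; production libraries typically use smoothed idf and different hash functions.

```python
import math
import re
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters/apostrophes (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def count_vectors(docs):
    """Count vectorizer: raw term counts over a shared, sorted vocabulary."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        rows.append(row)
    return vocab, rows


def tfidf_vectors(docs):
    """tf-idf: term count weighted by the unsmoothed idf log(N / df)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


def hash_vectors(docs, n_buckets=16):
    """Hash vectorizer: no vocabulary; each token is counted in a hashed bucket."""
    rows = []
    for d in docs:
        row = [0] * n_buckets
        for t in tokenize(d):
            row[zlib.crc32(t.encode("utf-8")) % n_buckets] += 1
        rows.append(row)
    return rows


# Hypothetical two-document corpus, for illustration only.
docs = ["Breaking news shocks the nation", "The nation reads the news"]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
hashed = hash_vectors(docs)
```

In this sketch, terms that occur in every document receive a tf-idf weight of zero, while the hashed representation has a fixed width regardless of vocabulary size, at the cost of possible bucket collisions.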


Pages: 12
