Presenting a bi-classification model to detect fake news from textual data using artificial intelligence methods and text analysis techniques

Times cited: 0
Author
Zhou, Wen [1]
Affiliation
[1] Guangdong Univ Foreign Studies, Sch New Media & Int Commun, South China Business Coll, Guangzhou 510545, Guangdong, Peoples R China
Source
INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS | 2025
Keywords
fake news; artificial intelligence; Neural Network; TF-IDF; Principal Component Analysis; BERT; OPTIMIZATION; NETWORK;
DOI
10.1177/18724981251319628
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This study presents a model for distinguishing fake news from truthful news in textual data. Using artificial intelligence (AI) methods and the principles of text analysis, a binary classification model was built that assigns each text to a deceptive or a truthful class. The base algorithms used for modeling were AdaBoost (Ada), Support Vector Classifier (SVC), Random Forest (RF), Neural Network (NN), BERT, and Convolutional Neural Network (ConvNet). The supporting techniques were TF-IDF for vectorizing the textual data, Principal Component Analysis (PCA) for feature transformation, word2index and word embedding models for converting words into numbers, and the N-gram technique for creating word sequences. The candidate models were then compared in a case study using several evaluation metrics. The results showed that, despite the high similarity between the two classes (86.6% similarity in the training data and 79.8% in the test data), the BERT model performed best. This model has high complexity and can better extract relationships within the data. The best accuracy reported in the baseline article was 0.90; this study improved it to 0.93.
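The text-to-number steps named in the abstract (word2index, N-grams, TF-IDF) can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the whitespace tokenizer, and the two sample sentences are hypothetical, and the TF-IDF weight uses the unsmoothed textbook form tf * log(N / df), whereas common libraries apply smoothing and normalization.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and split on whitespace; real pipelines clean more.
    return text.lower().split()

def ngrams(tokens, n):
    # Slide a window of n tokens over the sequence to form word N-grams.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_word2index(docs):
    # Map each distinct word to an integer; index 0 is left free for padding.
    vocab = {}
    for doc in docs:
        for tok in tokenize(doc):
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab

def tfidf(docs, n=1):
    # One sparse vector (dict) per document: term -> tf * log(N / df).
    term_lists = [ngrams(tokenize(d), n) for d in docs]
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))  # count each term once per document
    num_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # guard against empty documents
        vectors.append({t: (c / total) * math.log(num_docs / df[t])
                        for t, c in counts.items()})
    return vectors

# Hypothetical two-document corpus, for illustration only.
DOCS = ["the senate passed the bill", "the shocking secret they hide"]

if __name__ == "__main__":
    word2index = build_word2index(DOCS)
    print([word2index[t] for t in tokenize(DOCS[0])])  # integer-encoded text
    print(ngrams(tokenize(DOCS[0]), 2))                # bigrams
    print(tfidf(DOCS)[0])                              # unigram TF-IDF weights
```

A term that occurs in every document (here "the") receives a weight of zero, which is why TF-IDF emphasizes words that separate the classes. The resulting vectors would then be fed to a feature transformation such as PCA and to one of the classifiers listed above.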
Pages: 12