CoAID-DEEP: An Optimized Intelligent Framework for Automated Detecting COVID-19 Misleading Information on Twitter

Times cited: 69
Authors
Abdelminaam, Diaa Salama [1 ,2 ]
Ismail, Fatma Helmy [2 ]
Taha, Mohamed [1 ]
Taha, Ahmed [1 ]
Houssein, Essam H. [3 ]
Nabil, Ayman [2 ]
Affiliations
[1] Benha Univ, Fac Comp & Artificial Intelligence, Banha 13511, Egypt
[2] Misr Int Univ, Fac Comp Sci, Cairo 11341, Egypt
[3] Minia Univ, Fac Comp & Informat, Al Minya 61519, Egypt
Source
IEEE ACCESS | 2021 / Vol. 9
Keywords
COVID-19; Social networking (online); Feature extraction; Deep learning; Blogs; Viruses (medical); Organizations; Fake news; misleading information; pandemic; social media
DOI
10.1109/ACCESS.2021.3058066
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
COVID-19 has affected everyone's life. As COVID-19 cases have risen, misinformation about the virus has grown in parallel. The spread of this misinformation has created confusion among people, caused disturbances in society, and even led to deaths. Social media is central to our daily lives, and the Internet has become a significant source of knowledge. Owing to the widespread damage caused by fake news, it is important to build computerized systems to detect it. This paper proposes an updated deep neural network for identifying fake news. The deep learning techniques are the Modified-LSTM (one to three layers) and the Modified-GRU (one to three layers). In particular, we investigate a large dataset of tweets conveying information about COVID-19. We classify the claims into two categories: fake or non-fake. We compare the prediction accuracy of the proposed approaches with six baseline machine learning techniques: Decision Tree (DT), Logistic Regression (LR), K-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), and Naive Bayes (NB). The parameters of the deep learning techniques are optimized using Keras-tuner. Four benchmark datasets were used. Two feature extraction methods were applied to these datasets: TF-IDF with N-grams for the baseline machine learning models, and word embeddings for the proposed deep neural network methods. The results obtained with the proposed framework show high accuracy in detecting fake and non-fake tweets containing COVID-19 information. These results demonstrate a significant improvement over the existing state-of-the-art results of the baseline machine learning models.
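To make the baseline feature extraction step concrete, the following is a minimal sketch of TF-IDF weighting over word N-grams, written in plain Python. It is an illustration only: the abstract does not specify the tokenizer, the N-gram range, or the exact TF-IDF variant, so the whitespace tokenization, the unigram-plus-bigram range, and the smoothed IDF formula used here are all assumptions. The function names `word_ngrams` and `tfidf` and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max for a lowercased text."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_max=2):
    """Compute TF-IDF weights, one {ngram: weight} dict per document.

    TF is the n-gram count divided by the number of n-grams in the document.
    IDF is the smoothed form log((1 + N) / (1 + df)) + 1 (an assumed variant).
    """
    grams_per_doc = [word_ngrams(d, n_max) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # guard against empty documents
        vectors.append({
            g: (c / total) * (math.log((1 + n_docs) / (1 + df[g])) + 1)
            for g, c in counts.items()
        })
    return vectors


# Hypothetical toy tweets, not taken from the paper's benchmark datasets.
tweets = [
    "drinking bleach cures covid",
    "vaccines reduce severe covid",
    "wash hands to reduce spread",
]
vectors = tfidf(tweets)
```

In this sketch, an N-gram that appears in fewer documents (such as "bleach") receives a higher weight than one shared across documents (such as "covid"), which is the property that lets a baseline classifier focus on distinctive terms.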
Pages: 27840-27867
Page count: 28