Concept-drift detection index based on fuzzy formal concept analysis for fake news classifiers

Cited by: 13
Authors
Fenza, Giuseppe [1]
Gallo, Mariacristina [1]
Loia, Vincenzo [1]
Petrone, Alessandra [2]
Stanzione, Claudio [3]
Affiliations
[1] Univ Salerno, Dept Management & Innovat Syst, I-84084 Fisciano, SA, Italy
[2] Univ Salerno, Dept Polit Social & Commun Sci, I-84084 Fisciano, SA, Italy
[3] Ctr Higher Def Studies, Def Anal Res Inst, I-00165 Rome, RM, Italy
Keywords
Concept drift; Machine learning; Fuzzy formal concept analysis; Fake news; Text classification; ONLINE
DOI
10.1016/j.techfore.2023.122640
Chinese Library Classification (CLC) number
F [Economics]
Subject classification code
02
Abstract
Unpredictable changes in the underlying distribution of the streaming data over time are known as concept drift. The development of procedures and techniques for drift detection, interpretation, and adaptation is central to concept-drift research. Data research has demonstrated that machine learning in a concept-drift environment produces poor learning results if drift is not handled. This study focuses on defining the concept-drift detection index to predict when the performance of a machine learning model for text-stream classifiers is low. It proposes an index that relies on the Fuzzy Formal Concept Analysis theory. The index exploits the formal lattice to understand whether new incoming facts (e.g., news) are well represented in the training data used to build the machine-learning model. Fake news was deemed ideal for testing this new measure because its typical application scenario required handling a stream of unstructured content and concept-drift awareness. Experiments on three news datasets revealed a relevant correlation (i.e., 73.9%, 80.8%, and 81%) between the Accuracy of Random Forest (RF), Naive Bayes (NB), and Passive Aggressive (PA) models, respectively, and the proposed index. This strong correlation suggests that the new index can avoid incorrect classifications and help in retraining decisions.
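The abstract reports a correlation between classifier Accuracy and the proposed index but does not state how it was computed. As a minimal sketch, assuming a Pearson correlation over per-batch measurements, the relationship could be checked as follows. The function name `pearson` and the sample values are hypothetical illustrations and are not taken from the paper or its datasets.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-batch values (NOT results from the paper):
# one drift-index score and one measured accuracy per batch of news items.
index_values = [0.9, 0.8, 0.6, 0.5, 0.3]
accuracies = [0.92, 0.88, 0.75, 0.70, 0.55]

r = pearson(index_values, accuracies)
print(f"correlation between index and accuracy: {r:.3f}")
```

Under this assumption, a coefficient close to 1 would mean that a low index value for an incoming batch tends to coincide with low accuracy, which is the kind of signal the abstract suggests could inform a retraining decision.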
Pages: 11