Catching the drift: Using feature-free case-based reasoning for spam filtering

Authors
Delany, Sarah Jane [1]
Bridge, Derek [2]
Affiliations
[1] Dublin Inst Technol, Dublin, Ireland
[2] Univ Coll Cork, Cork, Ireland
Source
CASE-BASED REASONING RESEARCH AND DEVELOPMENT, PROCEEDINGS | 2007 / Vol. 4626
Keywords
TRACKING CONCEPT DRIFT; SELECTION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we compare case-based spam filters, focusing on their resilience to concept drift. In particular, we evaluate how to track concept drift using a case-based spam filter that uses a feature-free distance measure based on text compression. In our experiments, we compare two ways to normalise such a distance measure, finding that the one proposed in [1] performs better. We show that a policy as simple as retaining misclassified examples has a hugely beneficial effect on handling concept drift in spam but, on its own, it results in the case base growing by over 30%. We then compare two different retention policies and two different forgetting policies (one a form of instance selection, the other a form of instance weighting) and find that they perform roughly as well as each other while keeping the case base size constant. Finally, we compare a feature-based textual case-based spam filter with our feature-free approach. In the face of concept drift, the feature-based approach requires the case base to be rebuilt periodically so that we can select a new feature set that better predicts the target concept. We find feature-free approaches to have lower error rates than their feature-based equivalents.
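The abstract describes a compression-based, feature-free distance and a policy of retaining misclassified examples, but this record gives no formulas. The sketch below illustrates the general idea under stated assumptions: `zlib` stands in for the compressor, the distance is normalised as C(xy) / (C(x) + C(y)) (one possible normalisation, not necessarily the one the paper favours), and classification is a plain k-nearest-neighbour majority vote. All function names are hypothetical.

```python
import zlib
from collections import Counter


def csize(text: str) -> int:
    """Compressed size in bytes (zlib is an assumed stand-in compressor)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_distance(x: str, y: str) -> float:
    """Assumed normalisation: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest the texts share most of their content;
    values near 1.0 suggest they share very little.
    """
    return csize(x + y) / (csize(x) + csize(y))


def classify(case_base, message, k=3):
    """Majority vote over the k cases closest to the message."""
    nearest = sorted(
        case_base, key=lambda case: compression_distance(message, case[0])
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_and_retain(case_base, message, true_label, k=3):
    """Classify, then add the message to the case base only if misclassified."""
    predicted = classify(case_base, message, k)
    if predicted != true_label:
        case_base.append((message, true_label))
    return predicted


if __name__ == "__main__":
    ham = "Meeting moved to Thursday at ten, please bring the quarterly report draft."
    spam = "Buy cheap pills now, limited offer, click here to order your pills today!"
    cases = [(ham, "ham")]
    print(classify_and_retain(cases, spam, "spam", k=1))  # misclassified, so retained
    print(len(cases))
```

Because no features are extracted, nothing has to be re-selected when the wording of spam shifts; the case base simply absorbs the examples the filter got wrong. On its own this policy lets the case base grow, which is why the abstract pairs retention with forgetting policies.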
Pages: 314 / +
Page count: 4