SpamHunting: An instance-based reasoning system for spam labelling and filtering

Cited by: 39
Authors
Fdez-Riverola, F.
Iglesias, E. L.
Diaz, F.
Mendez, J. R.
Corchado, J. M.
Affiliations
[1] Univ Vigo, Escuela Super Ingn Informat, Dept Informat, Orense 32004, Spain
[2] Univ Valladolid, Dept Informat, Escuela Univ Informat, Segovia 40005, Spain
[3] Univ Salamanca, Dept Informat & Automat, E-37008 Salamanca, Spain
Keywords
IBR system; automatic reasoning; anti-spam filtering; model comparison
DOI
10.1016/j.dss.2006.11.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present an instance-based reasoning e-mail filtering model that outperforms classical machine learning techniques and other successful lazy learner approaches in the domain of anti-spam filtering. The architecture of the learning-based anti-spam filter is based on a tuneable enhanced instance retrieval network able to accurately generalize e-mail representations. The reuse of similar messages is carried out by a simple unanimous voting mechanism that determines whether the target case is spam or not. Before the system gives its final response, a revision stage is performed only when the assigned class is spam, in which the system employs general knowledge in the form of meta-rules. (c) 2006 Elsevier B.V. All rights reserved.
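The retrieve-and-reuse steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the enhanced instance retrieval network is replaced here by a plain Jaccard similarity over term sets, and the names `jaccard`, `retrieve`, `unanimous_vote`, `classify` and the parameter `k` are hypothetical. It also assumes that a message is labelled spam only when every retrieved neighbour is spam, and legitimate otherwise. The meta-rule revision stage is not modelled.

```python
from typing import List, Set, Tuple

# A stored case: (set of terms in the message, True if the message is spam).
Case = Tuple[Set[str], bool]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two term sets (assumed stand-in for the paper's retrieval network)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(target: Set[str], case_base: List[Case], k: int = 3) -> List[Case]:
    """Return the k stored cases most similar to the target message."""
    ranked = sorted(case_base, key=lambda case: jaccard(target, case[0]), reverse=True)
    return ranked[:k]


def unanimous_vote(neighbours: List[Case]) -> bool:
    """Label as spam only if there is at least one neighbour and all of them are spam."""
    return bool(neighbours) and all(is_spam for _, is_spam in neighbours)


def classify(target: Set[str], case_base: List[Case], k: int = 3) -> bool:
    """Retrieve similar cases, then reuse their labels through unanimous voting."""
    return unanimous_vote(retrieve(target, case_base, k))
```

Under these assumptions, a single legitimate message among the retrieved neighbours is enough to keep the target out of the spam class, which biases the sketch against false positives.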
Pages: 722-736
Page count: 15