A spam filtering method learning from Web browsing behavior

Cited by: 0
Authors
Takashita, Taiki [1]
Itokawa, Tsuyoshi [1]
Kitasuka, Teruaki [1]
Aritsugi, Masayoshi [1]
Affiliations
[1] Kumamoto Univ, Grad Sch Sci & Technol, Dept Comp Sci & Commun Engn, Kumamoto 8608555, Japan
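Source
KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS | 2008 / Vol. 5178
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract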
In this paper, a spam filtering method is proposed. We focus on the fact that most email users also browse the Web. The method reduces the troublesome maintenance of the spam filter, since the filter learns from Web browsing behavior in the background. The method uses each user's Web browsing behavior to learn ham words. Ham words are picked up from browsed Web pages using TF-IDF and stored in a database called the ham words list. For each received email, the method extracts keywords from the email, including the Web pages at the URLs it contains. If some keywords are in the ham words list, the email is treated as ham. In our experiments, several spam emails that a Bayesian filter fails to catch are detected as spam.
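The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the function names, the `top_k` and `min_hits` parameters, and the use of raw term frequency for email keyword extraction are all assumptions made for illustration, and the text of linked Web pages is passed in directly rather than fetched.

```python
import math
import re
from collections import Counter

# Assumption: a crude tokenizer that keeps lowercase words of 3+ letters.
TOKEN_RE = re.compile(r"[a-z]{3,}")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def build_ham_words(pages, top_k=3):
    """Collect the top_k TF-IDF terms of each browsed page into a ham words list."""
    docs = [Counter(tokenize(page)) for page in pages]
    n_docs = len(docs)
    # Document frequency: number of pages in which each term occurs.
    df = Counter()
    for counts in docs:
        df.update(counts.keys())
    ham_words = set()
    for counts in docs:
        total = sum(counts.values())
        if total == 0:
            continue
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        ham_words.update(t for t in ranked[:top_k] if scores[t] > 0)
    return ham_words


def extract_keywords(email_text, linked_pages=(), top_k=5):
    """Assumption: email keywords are the most frequent terms in the email
    body plus the text of the pages its URLs point to."""
    counts = Counter(tokenize(email_text))
    for page in linked_pages:
        counts.update(tokenize(page))
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return set(ranked[:top_k])


def is_ham(email_text, ham_words, linked_pages=(), min_hits=1):
    """Treat the email as ham if enough of its keywords are in the ham words list."""
    keywords = extract_keywords(email_text, linked_pages)
    return len(keywords & ham_words) >= min_hits


if __name__ == "__main__":
    browsed = [
        "python decorators tutorial python",
        "hiking trails kumamoto hiking",
        "camera lenses tutorial camera",
    ]
    ham = build_ham_words(browsed, top_k=2)
    print(is_ham("New python release notes", ham))
    print(is_ham("cheap pills buy now", ham))
```

In this sketch a single matching keyword is enough to mark an email as ham; raising the hypothetical `min_hits` parameter would make the filter stricter, at the cost of treating more legitimate mail as spam.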
Pages: 774-781
Page count: 8