Multilingual E-mail Classification using Bayesian Filtering and Language Translation

Cited by: 0
Authors
Banday, M. Tariq [1 ,2 ]
Sheikh, Shafiya Afzal [1 ,2 ]
Affiliations
[1] Univ Kashmir, Dept Elect, Srinagar, Jammu & Kashmir, India
[2] Univ Kashmir, Inst Technol, Srinagar, Jammu & Kashmir, India
Source
2014 INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I) | 2014
Keywords
SPAM; HAM; Filtering; E-mail; Multilingual; Online Language Translation;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
E-mail spam is a continuously growing threat to users, E-mail Service Providers (ESPs), and Internet Service Providers (ISPs): it consumes mailbox space, bandwidth, and time by flooding systems with unwanted, unsolicited messages. It can carry unsafe content such as viruses, phishing frauds, and other malicious code that can be used to launch various types of attacks. Several techniques and tools, including anti-spam filters, are employed to filter out spam e-mails at servers and clients. This paper reviews the spam-filtering methods and techniques currently employed at major e-mail service providers and evaluates how well they filter non-English e-mail messages. It proposes a technique for building a translation module that augments existing spam filters so that they can filter spam in non-English e-mail messages. The technique lets the spam filter train itself on a training data set in a chosen language and tune its parameters with every incoming message. An implementation of the technique as a translation module, together with experiments on a publicly available e-mail corpus, validated the correctness and working of the proposed technique.
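The idea in the abstract, translating a message before it reaches a Bayesian filter and updating the filter with every message, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `NaiveBayesFilter`, the word-for-word `toy_translate` stub, and `TOY_LEXICON` are hypothetical names introduced here for illustration, and the stub merely stands in for the online translation service mentioned in the keywords. The classifier is a plain multinomial naive Bayes with add-one smoothing, which is only one of several possible Bayesian filtering schemes.

```python
import math
import re
from collections import Counter
from typing import Callable, List

# Hypothetical toy lexicon standing in for a real translation service.
TOY_LEXICON = {
    "gane": "win", "dinero": "money", "gratis": "free",
    "reunión": "meeting", "mañana": "tomorrow",
}


def toy_translate(text: str) -> str:
    """Word-for-word lookup; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(w, w) for w in text.lower().split())


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesFilter:
    """Multinomial naive Bayes over tokens of the translated message."""

    LABELS = ("spam", "ham")

    def __init__(self, translate: Callable[[str], str] = lambda s: s):
        self.translate = translate  # translation hook applied before tokenizing
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.msg_counts = {label: 0 for label in self.LABELS}

    def train(self, text: str, label: str) -> None:
        """Incremental update: can be called for every labelled message."""
        self.msg_counts[label] += 1
        self.word_counts[label].update(tokenize(self.translate(text)))

    def spam_log_odds(self, text: str) -> float:
        """Return log P(spam | text) - log P(ham | text), up to smoothing."""
        tokens = tokenize(self.translate(text))
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = len(vocab) or 1
        total_msgs = sum(self.msg_counts.values())
        log_prob = {}
        for label in self.LABELS:
            # Add-one smoothing on both the class prior and the word likelihoods.
            prior = (self.msg_counts[label] + 1) / (total_msgs + 2)
            n_words = sum(self.word_counts[label].values())
            lp = math.log(prior)
            for tok in tokens:
                lp += math.log(
                    (self.word_counts[label][tok] + 1) / (n_words + vocab_size)
                )
            log_prob[label] = lp
        return log_prob["spam"] - log_prob["ham"]

    def classify(self, text: str) -> str:
        return "spam" if self.spam_log_odds(text) > 0 else "ham"


# Train on English data, then classify messages written in another language.
spam_filter = NaiveBayesFilter(translate=toy_translate)
spam_filter.train("win free money now", "spam")
spam_filter.train("free money offer", "spam")
spam_filter.train("meeting tomorrow at noon", "ham")
spam_filter.train("project meeting notes", "ham")

print(spam_filter.classify("gane dinero gratis"))
print(spam_filter.classify("reunión mañana"))
```

Passing the translator in as a callable keeps the classifier unaware of the source language, so the stub could in principle be swapped for a real translation client without touching the filtering logic.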
Pages: 696-701
Page count: 6