An intelligent system for spam detection and identification of the most relevant features based on evolutionary Random Weight Networks

Cited by: 182
Authors
Faris, Hossam [1]
Al-Zoubi, Ala M. [1]
Heidari, Ali Asghar [2]
Aljarah, Ibrahim [1]
Mafarja, Majdi [3]
Hassonah, Mohammad A. [1]
Fujita, Hamido [4]
Affiliations
[1] Univ Jordan, King Abdullah II Sch Informat Technol, Amman, Jordan
[2] Univ Tehran, Sch Surveying & Geospatial Engn, Tehran, Iran
[3] Birzeit Univ, Dept Comp Sci, Birzeit, Palestine
[4] IPU, Fac Software & Informat Sci, Takizawa, Iwate, Japan
Keywords
Spam filtering; Email spam detection; Feature analysis; Hybrid machine learning; Evolutionary; Random Weight Network; Feature selection; NEGATIVE SELECTION ALGORITHM; GENETIC ALGORITHM; NEURAL-NETWORKS; DETECTION MODEL; OPTIMIZATION; CLASSIFICATION; SCHEME
DOI
10.1016/j.inffus.2018.08.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the growing use of email as an essential and popular means of communication over the Internet comes a serious threat to the Internet and to society: spam. Spam messages expose Internet users to security risks and expose minors to inappropriate content. Moreover, spam wastes storage, bandwidth, and productivity. The problem is made worse by spammers, who keep inventing new techniques to evade spam filters. In addition, the massive data flow generated by hundreds of millions of users and the large number of attributes make the problem more cumbersome and complex. Evolutionary and adaptable spam detection models have therefore become a necessity. This paper proposes an intelligent detection system based on a Genetic Algorithm (GA) and a Random Weight Network (RWN) for email spam detection. The system also embeds an automatic capability to identify the most relevant features during the detection process. The proposed system is evaluated through an extensive series of experiments on three email corpora. The experimental results confirm that it achieves remarkable results in terms of accuracy, precision, and recall. Furthermore, it can automatically identify the most relevant features of spam emails.
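The GA plus RWN combination named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the chromosome encoding, the fitness function, or the GA operators, so everything below is an assumption. The sketch assumes an RWN with randomly fixed hidden weights and pseudo-inverse output weights, and a GA that evolves only a binary feature mask scored by validation accuracy. All function names and default parameters (`rwn_fit`, `ga_select_features`, `n_hidden`, `p_mut`, and so on) are hypothetical.

```python
import numpy as np

def rwn_fit(X, y, n_hidden, rng):
    """Fit an RWN: hidden weights are random and fixed, output weights are solved."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ y             # least-squares output weights
    return W, b, beta

def rwn_predict(model, X):
    """Return 0/1 labels by thresholding the network output at 0.5."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta >= 0.5).astype(int)

def fitness(mask, X_tr, y_tr, X_va, y_va, n_hidden, seed):
    """Validation accuracy of an RWN trained on the masked features (assumed fitness)."""
    if not mask.any():
        return 0.0                            # an empty feature set is useless
    rng = np.random.default_rng(seed)         # same random weights for every mask
    model = rwn_fit(X_tr[:, mask], y_tr, n_hidden, rng)
    return float((rwn_predict(model, X_va[:, mask]) == y_va).mean())

def ga_select_features(X_tr, y_tr, X_va, y_va, n_hidden=20,
                       pop_size=20, generations=15, p_mut=0.05, seed=0):
    """Evolve a boolean feature mask; needs at least two features."""
    rng = np.random.default_rng(seed)
    n_feat = X_tr.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5          # random initial masks
    best_mask, best_fit = pop[0].copy(), -1.0
    for _ in range(generations):
        fits = np.array([fitness(m, X_tr, y_tr, X_va, y_va, n_hidden, seed)
                         for m in pop])
        i = int(fits.argmax())
        if fits[i] > best_fit:
            best_mask, best_fit = pop[i].copy(), float(fits[i])
        children = [best_mask.copy()]                    # elitism
        while len(children) < pop_size:
            parents = []
            for _ in range(2):                           # binary tournament selection
                a, c = rng.integers(pop_size, size=2)
                parents.append(pop[a] if fits[a] >= fits[c] else pop[c])
            cut = rng.integers(1, n_feat)                # single-point crossover
            child = np.concatenate([parents[0][:cut], parents[1][cut:]])
            child ^= rng.random(n_feat) < p_mut          # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best_mask, best_fit
```

Calling `ga_select_features(X_train, y_train, X_val, y_val)` returns a boolean mask over the feature columns together with the best validation accuracy observed. The `True` entries of the mask are the features this sketch treats as most relevant, which mirrors the feature identification capability the abstract describes only in spirit.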
Pages: 67-83
Number of pages: 17