Big Data Framework for Zero-Day Malware Detection

Cited by: 17
Authors
Gupta, Deepak [1]
Rani, Rinkle [1]
Affiliation
[1] Thapar Univ, Dept Comp Sci & Engn, Patiala, Punjab, India
Keywords
Apache Spark; big data; machine learning; malware detection; MLlib
DOI
10.1080/01969722.2018.1429835
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Malware is recognized as one of the most dominant cyber threats on the Internet today. It is growing exponentially in volume, variety, and velocity, and thus overwhelms the traditional approaches used for malware detection and classification. Moreover, with the advent of the Internet of Things, the number of digital devices is growing enormously, and in such a scenario malicious binaries are bound to grow even faster, making malware detection a big data problem. To analyze and detect unknown malware on a large scale, security analysts need to use machine learning algorithms together with big data technologies. These technologies help them deal with the current threat landscape, which consists of a large and complex flux of malicious binaries. This paper proposes the design of a scalable architecture built on top of Apache Spark that uses Spark's scalable machine learning library (MLlib) to detect zero-day malware. The proposed platform is tested and evaluated on a dataset of 0.2 million files, comprising 0.05 million clean files and 0.15 million malicious binaries that cover a large number of malware families over a period of 7 years starting from 2010.
Pages: 103-121
Page count: 19