A Novel Solutions for Malicious Code Detection and Family Clustering Based on Machine Learning

Cited by: 20
Authors
Yang, Hangfeng [1]
Li, Shudong [1]
Wu, Xiaobo [2]
Lu, Hui [1]
Han, Weihong [1]
Affiliations
[1] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 510006, Peoples R China
[2] Guangzhou Univ, Sch Comp Sci & Cyber Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Malware; ensemble model; malware classification; family clustering; t-SNE; KEY MANAGEMENT SCHEME; INTERNET;
DOI
10.1109/ACCESS.2019.2946482
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Malware has become a major threat to cyberspace security, not only because malware itself is growing more complex, but also because new malicious code is continuously being created. In this paper, we propose two novel methods to address the malware identification problem. The first addresses malware classification. Unlike traditional machine learning approaches, our method introduces ensemble models to solve the malware classification problem. The second addresses malware family clustering. Unlike classic malware family clustering algorithms, our method uses the t-SNE algorithm to visualize the feature data and then determines the number of malware families. Both methods have been extensively tested on a large number of real-world malware samples. The results show that the first is far superior to the existing individual models and that the second has good adaptability. Our methods can be used for malicious code classification and family clustering with higher accuracy.
Pages: 148853-148860
Number of pages: 8