EC2: Ensemble Clustering and Classification for Predicting Android Malware Families

Cited by: 67
Authors
Chakraborty, Tanmoy [1]
Pierazzi, Fabio [2]
Subrahmanian, V. S. [3]
Affiliations
[1] Delhi IIIT D, Indraprastha Inst Informat Technol, Dept Comp Sci & Engn, Delhi 110020, India
[2] Univ Modena & Reggio Emilia, Dept Engn Enzo Ferrari, I-41125 Modena, Italy
[3] Dartmouth Coll, Dept Comp Sci, Hanover, NH 03755 USA
Keywords
Malware; Androids; Humanoid robots; Feature extraction; Smart phones; Mobile communication; Clustering algorithms; Android; malware; ensemble; classification; clustering;
DOI
10.1109/TDSC.2017.2739145
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As the most widely used mobile platform, Android is also the biggest target for mobile malware. Given the increasing number of Android malware variants, detecting malware families is crucial: it lets security analysts identify situations where signatures of a known malware family can be adapted, rather than manually inspecting the behavior of every sample. We present EC2 (Ensemble Clustering and Classification), a novel algorithm for discovering Android malware families of varying sizes, ranging from very large to very small families (even if previously unseen). We present a performance comparison of several traditional classification and clustering algorithms for Android malware family identification on DREBIN, the largest public Android malware dataset with labeled families. We use the output of both supervised classifiers and unsupervised clustering to design EC2. Experimental results on both the DREBIN and the more recent Koodous malware datasets show that EC2 accurately detects both small and large families, outperforming several comparative baselines. Furthermore, we show how to automatically characterize and explain unique behaviors of specific malware families, such as FakeInstaller, MobileTx, and Geinimi. In short, EC2 provides an early warning system for emerging new malware families, a robust predictor of the family to which a new malware sample belongs (when that family is not new), and novel strategies for data-driven understanding of malware behaviors.
Pages: 262-277
Number of pages: 16