Mobile-Sandbox: combining static and dynamic analysis with machine-learning techniques

Cited by: 74
Authors
Spreitzenbarth, Michael [1]
Schreck, Thomas [1]
Echtler, Florian [2]
Arp, Daniel [3]
Hoffmann, Johannes [4]
Affiliations
[1] Univ Erlangen Nurnberg, D-91054 Erlangen, Germany
[2] Univ Regensburg, D-93053 Regensburg, Germany
[3] Univ Gottingen, D-37073 Gottingen, Germany
[4] Ruhr Univ Bochum, Bochum, Germany
Keywords
Android; Malware; Automated analysis; Machine learning
DOI
10.1007/s10207-014-0250-0
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Smartphones in general and Android in particular are increasingly shifting into the focus of cyber criminals. For understanding the threat to security and privacy, it is important for security researchers to analyze malicious software written for these systems. The exploding number of Android malware calls for automation in the analysis. In this paper, we present Mobile-Sandbox, a system designed to automatically analyze Android applications in novel ways: First, it combines static and dynamic analysis, i.e., results of static analysis are used to guide dynamic analysis and extend coverage of executed code. Additionally, it uses specific techniques to log calls to native (i.e., "non-Java") APIs, and last but not least it combines these results with machine-learning techniques to cluster the analyzed samples into benign and malicious ones. We evaluated the system on more than 69,000 applications from Asian third-party mobile markets and found that about 21% of them actually use native calls in their code.
Pages: 141-153
Page count: 13