A Probabilistic Discriminative Model for Android Malware Detection with Decompiled Source Code

Cited by: 108
Authors
Cen, Lei [1]
Gates, Christopher S. [2]
Si, Luo [2]
Li, Ninghui [2]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
Android; malicious application; machine learning; discriminative model; classification
DOI
10.1109/TDSC.2014.2355839
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Mobile devices are an important part of our everyday lives, and the Android platform has become a market leader. In recent years, a number of approaches for Android malware detection have been proposed, using permissions, source code analysis, or dynamic analysis. In this paper, we propose to use a probabilistic discriminative model based on regularized logistic regression for Android malware detection. Through extensive experimental evaluation, we demonstrate that it can generate probabilistic outputs with highly accurate classification results. In particular, we propose to use Android API calls extracted from decompiled source code as features, and we analyze and explore issues in feature granularity, feature representation, feature selection, and regularization. We show that the probabilistic discriminative model also works well with permissions, and substantially outperforms the state-of-the-art methods for Android malware detection with application permissions. Furthermore, the discriminative learning model achieves the best detection results by combining both decompiled source code and application permissions. To the best of our knowledge, this is the first research that proposes a probabilistic discriminative model for Android malware detection with a thorough study of the desired representation of decompiled source code, and the first research work for the Android malware detection task that combines analysis of both decompiled source code and application permissions.
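The core idea in the abstract, regularized logistic regression over API-call features, can be illustrated with a minimal sketch. This is not the authors' implementation: the L2 penalty, the binary presence encoding, the four-entry API vocabulary, the toy training set, and all function names (`featurize`, `train_logreg`, `predict_proba`) are assumptions made purely for illustration. The paper itself studies several choices of feature representation and regularization that this sketch does not reproduce.

```python
import math

# Hypothetical API-call vocabulary; a real system would build it from a corpus.
API_VOCAB = ["sendTextMessage", "getDeviceId", "openConnection", "setContentView"]

def featurize(api_calls):
    """Binary presence vector: 1.0 if the app calls the API, else 0.0 (assumed encoding)."""
    present = set(api_calls)
    return [1.0 if name in present else 0.0 for name in API_VOCAB]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Probability that feature vector x belongs to the malicious class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logreg(X, y, l2=0.01, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log loss plus an L2 penalty (assumed) on w."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # derivative of log loss w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])  # penalty shrinks weights toward 0
        b -= lr * grad_b / n  # bias is left unregularized
    return w, b

# Toy, made-up training set: (API calls found in decompiled code, label 1 = malicious).
apps = [
    (["sendTextMessage", "getDeviceId"], 1),
    (["sendTextMessage", "getDeviceId", "openConnection"], 1),
    (["setContentView"], 0),
    (["setContentView", "openConnection"], 0),
]
X = [featurize(calls) for calls, _ in apps]
y = [label for _, label in apps]

w, b = train_logreg(X, y)
p_malicious = predict_proba(w, b, featurize(["sendTextMessage", "getDeviceId"]))
p_benign = predict_proba(w, b, featurize(["setContentView"]))
print(f"P(malicious | SMS + device ID calls) = {p_malicious:.3f}")
print(f"P(malicious | UI call only)          = {p_benign:.3f}")
```

Because the model outputs a probability rather than a hard label, a deployment could pick its own decision threshold to trade false positives against missed detections. Permission features, if available, could be appended to the same vector under the same assumed encoding.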
Pages: 400-412
Number of pages: 13