Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

Cited by: 69
Authors
Dong, Shuaike [1]
Li, Menghao [2]
Diao, Wenrui [3]
Liu, Xiangyu [4]
Liu, Jian [2]
Li, Zhou
Xu, Fenghao [1]
Chen, Kai [2]
Wang, XiaoFeng [5]
Zhang, Kehuan [1]
Affiliations
[1] Chinese Univ Hong Kong, Sha Tin, Hong Kong, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[3] Jinan Univ, Guangzhou, Peoples R China
[4] Alibaba Inc, Hangzhou, Peoples R China
[5] Indiana Univ, Bloomington, IN USA
Source
SECURITY AND PRIVACY IN COMMUNICATION NETWORKS, SECURECOMM 2018, PT I | 2018 / Vol. 254
Funding
National Natural Science Foundation of China
Keywords
Android; Obfuscation; Static analysis; Code protection
DOI
10.1007/978-3-030-01701-9_10
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Program code is a valuable asset to its owner. Because Java is easy to reverse engineer, code protection is particularly important for Android apps. To this end, both legitimate app developers and malware authors widely use code obfuscation, which complicates the representation of source code or machine code in order to hinder manual investigation and code analysis. Although many previous studies have focused on obfuscation techniques, our knowledge of how real-world developers apply obfuscation is still limited. In this paper, we seek to better understand Android obfuscation and depict a holistic view of its usage through a large-scale investigation in the wild. In particular, we focus on three popular obfuscation approaches: identifier renaming, string encryption, and Java reflection. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We learned several interesting facts from the results. For example, more apps on third-party markets than malware use identifier renaming, and malware authors use string encryption more frequently. We are also interested in the explanation behind each finding, so we carry out in-depth code analysis on a sample of Android apps. We believe our study will help developers select the most suitable obfuscation approach and, in the meantime, help researchers improve code analysis systems in the right direction.
Pages: 172-192
Page count: 21