Demystifying the Evolution of Android Malware Variants

Cited by: 0
Authors
Tang, Lihong [1]
Chen, Xiao [2]
Wen, Sheng [1]
Li, Li [3]
Grobler, Marthie [4]
Xiang, Yang [1]
Affiliations
[1] Swinburne Univ Technol, Hawthorn, Vic 3122, Australia
[2] Monash Univ, Dept Software Syst & Cybersecur, Clayton, Vic 3800, Australia
[3] Beihang Univ, Sch Software, Beijing 100191, Peoples R China
[4] CSIRO's Data61, Clayton, Vic 3168, Australia
Keywords
Android; malware; variants; evolution; phylogeny
DOI
10.1109/TDSC.2023.3325912
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Understanding the evolution of Android malware is important because it facilitates the development of defence techniques that proactively capture malware features. So far, researchers have mainly relied on dendrogram or family-tree analysis to study malware's evolutionary development. However, our research finds that these techniques cannot support comprehensive malware evolution modelling, that is, modelling which explains in detail why Android malware samples evolve in specific ways. This shortcoming is mainly caused by the coarse-grained clustering and analysis of malware samples. For example, because these works do not divide the malware samples of a family into variant sets and explore the evolution principles among those sets, they usually fail to capture new variants that have been empowered by feature 'drifting' during evolution. To address this problem, we propose a fine-grained and in-depth analysis of Android malware. Our experimental work systematically reveals the phylogenetic relationships among the variant sets for a deeper malware evolution analysis. To evaluate the correctness of our analysis, we introduce five metrics: the silhouette coefficient, the creation date, the variant labels, the representativeness of the variant set formula, and the correctness of the linked edges. The results show that our variant clustering achieved a high silhouette value at a small sample distance (0.3), a small standard deviation (three months and 16 days) in the dates on which the malware samples were last modified, a high label consistency (91.4%), and a high representativeness (93.1%) of the variant set formula. All the linked variant sets are connected based on our PhyloNet construction rules. We further analyse the coding details of Android malware for each variant set and summarise models of their evolutionary development. In this work, we successfully expose two major models of malware evolution: active evolution and passive evolution. We also disclose four technical explanations for the incentives behind the two evolution models (two for each model). These findings are valuable for proactive defence against newly emerged malware samples.
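The abstract names the silhouette coefficient as one of its clustering metrics without defining it. As a reading aid only, the following is a minimal sketch of the standard silhouette definition, computed from a precomputed pairwise distance matrix. It is not the authors' implementation: the function name `silhouette_score`, the toy distances, and the toy labels are all hypothetical, and the paper's actual sample distance measure is not described in this record.

```python
from statistics import mean


def silhouette_score(dist, labels):
    """Mean silhouette coefficient from a precomputed distance matrix.

    dist[i][j] is the distance between samples i and j; labels[i] is the
    cluster (here: hypothetical variant set) assigned to sample i.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = mean(dist[i][j] for j in own if j != i)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist[i][j] for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


# Toy data (assumed, not from the paper): two tight, well-separated sets.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
toy_labels = ["A", "A", "B", "B"]
toy_score = silhouette_score(toy_dist, toy_labels)
```

Values close to 1 indicate that samples sit much nearer to their own set than to any other set, which is the sense in which a "high silhouette value" supports a clustering.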
Pages: 3324-3341
Number of pages: 18