Android Malware Family Classification and Characterization Using CFG and DFG

Cited by: 0
Authors
Xu, Zhiwu [1, 2]
Ren, Kerong [1 ]
Song, Fu [3 ]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[2] Shenzhen Univ, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen, Peoples R China
[3] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
Source
2019 13TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF SOFTWARE ENGINEERING (TASE 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Android malware; malware family; malware characterization; static analysis; deep learning;
DOI
10.1109/TASE.2019.00014
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Android malware has become a serious threat to our daily lives, and thus there is a pressing need to effectively mitigate or defend against it. Recently, many approaches and tools for analyzing Android malware have been proposed to protect legitimate users from this threat. However, most approaches focus on malware detection, while only a few consider malware classification or malware characterization. In this paper, we extend CDGDroid to classify and characterize Android malware families automatically. We first apply the static analysis used in CDGDroid to extract control-flow graphs (CFGs) and data-flow graphs (DFGs) at the instruction level. We then encode the graphs into matrices and use them to build the family classification models via deep learning. For family characterization, we extract n-gram sequences from the graphs and filter them according to the weights of the classification model built for the target family. We then construct a vector space model and select the top-k sequences as a characterization of the target family. Our experiments show that the family classification model that takes the horizontal combination of CFG and DFG as features achieves the best accuracy among all the models. This model also outperforms CDGDroid, Drebin, and many of the antivirus tools gathered in VirusTotal. Finally, we have conducted experiments on family characterization, and the results show that our characterization can capture the malicious behaviors of the tested families.
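The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything concrete in it is an assumption made for illustration: the five-entry `OPCODES` vocabulary, the count-based adjacency encoding, and the TF-IDF-style score standing in for the vector space model. The filtering step based on classification-model weights is omitted, because the abstract gives no detail on it.

```python
from collections import Counter
from math import log

# Hypothetical instruction-type vocabulary (assumption, not from the paper).
OPCODES = ["move", "invoke", "if", "goto", "return"]
INDEX = {op: i for i, op in enumerate(OPCODES)}


def encode_graph(edges):
    """Encode a graph over instruction types as a count adjacency matrix."""
    n = len(OPCODES)
    matrix = [[0] * n for _ in range(n)]
    for src, dst in edges:
        matrix[INDEX[src]][INDEX[dst]] += 1
    return matrix


def horizontal_combine(cfg_matrix, dfg_matrix):
    """Place the DFG matrix to the right of the CFG matrix, row by row."""
    return [c_row + d_row for c_row, d_row in zip(cfg_matrix, dfg_matrix)]


def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def top_k_ngrams(family_seqs, other_seqs, n, k):
    """Rank the target family's n-grams by an assumed TF-IDF-style score."""
    all_seqs = family_seqs + other_seqs
    # Term frequency: occurrences across the target family's sequences.
    tf = Counter(g for seq in family_seqs for g in ngrams(seq, n))
    # Document frequency: number of sequences (any family) containing the n-gram.
    df = Counter(g for seq in all_seqs for g in set(ngrams(seq, n)))
    total = len(all_seqs)
    scores = {g: tf[g] * (log((1 + total) / (1 + df[g])) + 1) for g in tf}
    # Sort by descending score, breaking ties by the n-gram itself.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:k]


if __name__ == "__main__":
    cfg = encode_graph([("move", "invoke"), ("invoke", "if"), ("if", "goto")])
    dfg = encode_graph([("move", "invoke")])
    features = horizontal_combine(cfg, dfg)
    print(len(features), len(features[0]))
```

With this vocabulary the combined feature matrix has 5 rows and 10 columns, since each row of the CFG matrix is extended by the matching row of the DFG matrix. A real pipeline would feed such matrices to a deep learning classifier.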
Pages: 49-56
Number of pages: 8