Hybrid sequence-based Android malware detection using natural language processing

Cited by: 45
Authors
Zhang, Nan [1]
Xue, Jingfeng [1]
Ma, Yuxi [1]
Zhang, Ruyun [2]
Liang, Tiancai [3]
Tan, Yu-an [4]
Affiliations
[1] Beijing Inst Technol, Sch Comp, Beijing, Peoples R China
[2] Zhejiang Lab, Hangzhou, Zhejiang, Peoples R China
[3] GRG Banking Equipment Co Ltd, Guangzhou 510145, Peoples R China
[4] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Android malware detection; attention; deep learning; hybrid analysis; machine learning; natural language processing; text classification;
DOI
10.1002/int.22529
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Android platform has been a target of attackers due to its openness and increasing popularity. Android malware has increased explosively in recent years, posing serious threats to Android security. Proposing efficient Android malware detection methods is therefore crucial to defeating malware. Machine learning on features extracted from static or dynamic analysis has recently played an important role in malware detection. However, code obfuscation, code encryption, and dynamic code loading can be employed to hinder systems based solely on static analysis, while purely dynamic analysis systems cannot detect all potential code execution paths. To address these issues, we propose CoDroid, a sequence-based hybrid Android malware detection method that uses sequences of static opcodes and dynamic system calls. We treat each sequence as a sentence in natural language processing and construct a CNN-BiLSTM-Attention classifier, which consists of Convolutional Neural Networks (CNNs) and a Bidirectional Long Short-Term Memory (BiLSTM) network with an attention language model. We extensively evaluate CoDroid on a real-world data set and perform a comprehensive analysis against existing related detection methods. The evaluations show the effectiveness and flexibility of CoDroid across a variety of experimental settings.
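The idea of treating an opcode or system-call sequence as a sentence can be illustrated with a minimal Python sketch. It is not taken from the paper: the helper names (`build_vocab`, `encode`), the reserved `<pad>` and `<unk>` tokens, the toy sequences, and the fixed length of 5 are all assumptions made for illustration. It only shows how token sequences might be mapped to fixed-length integer id lists, the usual input form for an embedding layer in front of a text classifier.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed special tokens, not from the paper

def build_vocab(sequences, min_count=1):
    """Map each token to an integer id; ids 0 and 1 are reserved for PAD and UNK."""
    counts = Counter(tok for seq in sequences for tok in seq)
    vocab = {PAD: 0, UNK: 1}
    # Most frequent tokens first; ties broken alphabetically so ids are deterministic.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(seq, vocab, max_len):
    """Turn a token sequence into a fixed-length id list (truncate, then right-pad)."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in seq[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))

# Hypothetical toy data: Dalvik opcode mnemonics (static) and syscall names (dynamic).
opcode_seqs = [["const", "invoke-virtual", "move-result", "return"],
               ["const", "invoke-static", "return"]]
syscall_seqs = [["open", "read", "close"],
                ["open", "write", "write", "close"]]

# Each modality gets its own vocabulary, like two separate "languages".
opcode_vocab = build_vocab(opcode_seqs)
syscall_vocab = build_vocab(syscall_seqs)

static_ids = [encode(s, opcode_vocab, max_len=5) for s in opcode_seqs]
dynamic_ids = [encode(s, syscall_vocab, max_len=5) for s in syscall_seqs]
```

Keeping a separate vocabulary per modality is one possible design choice; the abstract does not say how CoDroid combines the static and dynamic sequences, so the sketch leaves them as two independent encoded inputs.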
Pages: 5770-5784
Page count: 15