Hybrid sequence-based Android malware detection using natural language processing

Cited by: 45
Authors
Zhang, Nan [1]
Xue, Jingfeng [1]
Ma, Yuxi [1]
Zhang, Ruyun [2]
Liang, Tiancai [3]
Tan, Yu-an [4]
Affiliations
[1] Beijing Inst Technol, Sch Comp, Beijing, Peoples R China
[2] Zhejiang Lab, Hangzhou, Zhejiang, Peoples R China
[3] GRG Banking Equipment Co Ltd, Guangzhou 510145, Peoples R China
[4] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Android malware detection; attention; deep learning; hybrid analysis; machine learning; natural language processing; text classification
DOI
10.1002/int.22529
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Android platform has been a target of attackers because of its openness and growing popularity. Android malware has increased explosively in recent years, posing serious threats to Android security. Proposing efficient Android malware detection methods is therefore crucial to defeating malware. Machine learning over features extracted by static or dynamic analysis has recently played an important role in malware detection. However, code obfuscation, code encryption, and dynamic code loading can be used to hinder systems based solely on static analysis, while purely dynamic analysis systems cannot cover all potential code execution paths. To address these issues, we propose CoDroid, a sequence-based hybrid Android malware detection method that uses sequences of static opcodes and dynamic system calls. We treat each sequence as a sentence in the natural language processing sense and construct a CNN-BiLSTM-Attention classifier, which consists of Convolutional Neural Networks (CNNs) and a Bidirectional Long Short-Term Memory (BiLSTM) network with an attention language model. We extensively evaluate CoDroid on a real-world data set and perform a comprehensive analysis against other existing related detection methods. The evaluations show the effectiveness and flexibility of CoDroid across a variety of experimental settings.
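As an illustration of what "treating a sequence as a sentence" can look like in practice, the following minimal Python sketch maps opcode and system call tokens to integer IDs and pads them to a fixed length, which is the usual input form for an embedding layer in front of a CNN or BiLSTM. It is not the authors' implementation: the function names (`build_vocab`, `encode`), the `<pad>`/`<unk>` tokens, the use of separate vocabularies for the two sequence types, and the sample tokens are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical special tokens; the paper's abstract does not specify any.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(sequences, min_count=1):
    """Assign an integer ID to every token seen at least min_count times."""
    counts = Counter(tok for seq in sequences for tok in seq)
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so IDs are stable.
    for tok, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(seq, vocab, max_len):
    """Turn a token sequence into a fixed-length list of integer IDs."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in seq[:max_len]]
    # Right-pad short sequences so every sample has the same length.
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Made-up sample data: one static opcode sequence and one dynamic
# system call sequence per app.
static_seqs = [
    ["const", "invoke-virtual", "move-result", "return"],
    ["const", "invoke-virtual", "return-void"],
]
dynamic_seqs = [
    ["open", "read", "close"],
    ["open", "write", "write", "close"],
]

op_vocab = build_vocab(static_seqs)
sys_vocab = build_vocab(dynamic_seqs)

# "goto" was never seen while building the vocabulary, so it maps to <unk>.
op_ids = encode(["const", "invoke-virtual", "goto"], op_vocab, max_len=5)
sys_ids = encode(["open", "read", "close"], sys_vocab, max_len=4)
print(op_ids)   # [2, 3, 1, 0, 0]
print(sys_ids)  # [3, 5, 2, 0]
```

In a hybrid setup of the kind the abstract describes, the two ID lists would be fed to the classifier as the static and dynamic views of the same app; how the two views are combined is not stated in the abstract and is left open here.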
Pages: 5770-5784
Number of pages: 15