NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble

Cited by: 5
Authors
Ullah, Farhan [1]
Ullah, Shamsher [2]
Srivastava, Gautam [3, 5, 6]
Lin, Jerry Chun-Wei [4]
Zhao, Yue [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Software, Xian 710072, Shaanxi, Peoples R China
[2] Shenzhen Univ, Sch Comp Sci & Software Engn, Shenzhen 518000, Peoples R China
[3] Brandon Univ, Dept Math & Comp Sci, Brandon, MB R7A 6A9, Canada
[4] Western Norway Univ Appl Sci, Dept Comp Sci Elect Engn & Math Sci, N-5063 Bergen, Norway
[5] China Med Univ, Res Ctr Interneural Comp, Taichung 40402, Taiwan
[6] Lebanese Amer Univ, Dept Comp Sci & Math, Beirut 1102, Lebanon
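Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Network traffic; Malware classification; Transfer learning; Explainable AI; Cybersecurity
DOI
10.1007/s11276-023-03414-5
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract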
Malware currently poses a substantial risk to the security of Android applications. It can steal sensitive information and disrupt the economy, social structures, and the financial sector. Malicious network traffic targets Android applications because they are constantly connected. This study develops the NMal-Droid approach for network-based Android malware detection and classification. First, we design a packet parser algorithm that filters the combination of HTTP traces and TCP flows from PCAP (packet capture) files. Second, we develop a fine-tuned embedding approach that uses a pre-trained word2vec model to analyze feature embeddings in three different ways: random, static, and dynamic. It learns and extracts feature matrices that capture related meanings. Third, a Convolutional Neural Network (CNN) extracts effective features from the embedded information. Fourth, a Bi-directional Gated Recurrent Unit (Bi-GRU) neural network computes gradients in both the time-forward and time-reversed directions. Finally, a multi-head ensemble of CNN-BiGRU is developed for accurate malware classification and detection. The proposed approach is evaluated with five different activation functions, 100 filters, and kernel sizes ranging from 1 to 5 for in-depth investigation. An explainable AI-based experiment is conducted to interpret and validate the proposed approach. The method is tested on two large Android malware datasets, CIC-AAGM2017 and CICMalDroid 2020, which together comprise 10.2K malware and 3.2K benign samples. The results show that the proposed approach outperforms state-of-the-art methods.
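The abstract's multi-kernel convolution step (100 filters, kernel sizes 1 to 5) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the weights are random and untrained, the word2vec embedding and Bi-GRU stages are omitted, and the function names `conv1d_max_pool` and `multi_head_features`, the ReLU activation, the max-over-time pooling, and the concatenation of heads are all assumptions made for illustration.

```python
import random


def conv1d_max_pool(seq, filters, k):
    """Slide each filter over windows of k consecutive token vectors,
    apply ReLU, and keep the largest response per filter."""
    n_windows = len(seq) - k + 1
    pooled = []
    for weights in filters:  # each filter is a flat list of k * dim weights
        best = 0.0  # starting at 0 folds the ReLU into the max
        for start in range(n_windows):
            window = [x for vec in seq[start:start + k] for x in vec]
            activation = sum(w * x for w, x in zip(weights, window))
            best = max(best, activation)
        pooled.append(best)
    return pooled


def multi_head_features(seq, kernel_sizes=(1, 2, 3, 4, 5), n_filters=100, seed=0):
    """Run one convolution head per kernel size and concatenate the outputs.
    Weights are random placeholders (assumption); a real model would learn them."""
    rng = random.Random(seed)
    dim = len(seq[0])
    features = []
    for k in kernel_sizes:
        filters = [
            [rng.uniform(-0.1, 0.1) for _ in range(k * dim)]
            for _ in range(n_filters)
        ]
        features.extend(conv1d_max_pool(seq, filters, k))
    return features


# Toy input: 12 tokens, each an 8-dimensional embedding (hypothetical sizes).
_rng = random.Random(1)
tokens = [[_rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(12)]
feats = multi_head_features(tokens)
print(len(feats))  # 5 kernel sizes * 100 filters = 500
```

In a full pipeline of the kind the abstract describes, a feature vector like this would be combined with a recurrent branch before classification; how the paper merges the branches is not stated in this record.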
Pages: 6177-6198
Page count: 22