Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2
Authors
Zhang, Yunchun [1]
Liao, Zikun [1]
Zhang, Ning [1]
Min, Shaohui [1]
Wang, Qi [1]
Quek, Tony Q. S. [2]
Zhao, Mingxiong [1]
Affiliations
[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 16
Funding
National Natural Science Foundation of China
Keywords
Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; Deep neural networks (DNNs); Malware classification; Malware images; Network
DOI
10.1109/JIOT.2024.3353250
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although numerous deep neural networks have recently been proposed for malware classification, detecting malware effectively in large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address this, we design a deep hashing-based malware classification model with two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are built on the high-layer outputs (feature maps) of a ResNet50 trained on gray-scale malware images. To maximize the Hamming distance (dissimilarity) between hash values of malware samples from different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K similar samples. In the second part, we propose majority voting and Hamming-distance-based voting to identify malware from the retrieved results. Experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models perform best with a 48-bit hash code length. Furthermore, Hamming-distance-based voting implemented with RNDPN outperforms the other models in malware classification, achieving 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
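The retrieve-then-vote idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hash codes are toy 8-bit integers rather than outputs of a trained network, the function names are hypothetical, and the weighting used in `distance_weighted_vote` (each neighbor contributes `code_bits - distance`) is an assumption, since the abstract does not state how Hamming distances enter the vote.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def top_k(query, database, k):
    """Return (distance, family) pairs for the k codes closest to the query."""
    scored = [(hamming(query, code), family) for code, family in database]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def majority_vote(neighbors):
    """Predict the family that occurs most often among the neighbors."""
    return Counter(family for _, family in neighbors).most_common(1)[0][0]


def distance_weighted_vote(neighbors, code_bits):
    """Predict the family with the largest summed similarity.

    Assumed weighting: a neighbor at Hamming distance d adds (code_bits - d).
    """
    scores = defaultdict(int)
    for distance, family in neighbors:
        scores[family] += code_bits - distance
    return max(scores, key=scores.get)


# Toy database of (hash code, family label) pairs; values are illustrative.
database = [
    (0b11110000, "A"),
    (0b11101111, "B"),
    (0b00001000, "B"),
    (0b00001111, "C"),
]
query = 0b11110000

neighbors = top_k(query, database, k=3)
majority_label = majority_vote(neighbors)
weighted_label = distance_weighted_vote(neighbors, code_bits=8)
```

In this toy case the two rules disagree: two distant neighbors from family "B" win the majority vote, while the single exact match from family "A" wins once distances are taken into account.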
Pages: 26837 - 26851
Page count: 15
Related papers
50 in total
  • [31] DTMIC: Deep transfer learning for malware image classification
    Kumar, Sanjeev
    Janet, B.
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2022, 64
  • [32] HYDRA: A multimodal deep learning framework for malware classification
    Gibert, Daniel
    Mateu, Carles
    Planes, Jordi
    COMPUTERS & SECURITY, 2020, 95
  • [33] Deep Learning Applied to Imbalanced Malware Datasets Classification
    Salas, Marcelo Palma
    de Geus, Paulo Licio
    JOURNAL OF INTERNET SERVICES AND APPLICATIONS, 2024, 15 (01) : 342 - 359
  • [34] RMDNet-Deep Learning Paradigms for Effective Malware Detection and Classification
    Puneeth, S.
    Lal, Shyam
    Pratap Singh, Mahendra
    Raghavendra, B. S.
    IEEE ACCESS, 2024, 12 : 82622 - 82635
  • [35] Hybrid Android Malware Detection and Classification Using Deep Neural Networks
    Rashid, Muhammad Umar
    Qureshi, Shahnawaz
    Abid, Abdullah
    Alqahtany, Saad Said
    Alqazzaz, Ali
    Hassan, Mahmood ul
    Reshan, Mana Saleh Al
    Shaikh, Asadullah
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2025, 18 (01)
  • [36] MVIIDroid: A Multiple View Information Integration Approach for Android Malware Detection and Family Identification
    Wu, Qing
    Li, Miaomiao
    Zhu, Xueling
    Liu, Bo
    IEEE MULTIMEDIA, 2020, 27 (04) : 48 - 57
  • [37] Deep Learning Model with Sequential Features for Malware Classification
    Wu, Xuan
    Song, Yafei
    Hou, Xiaoyi
    Ma, Zexuan
    Chen, Chen
    APPLIED SCIENCES-BASEL, 2022, 12 (19)
  • [38] Fusing feature engineering and deep learning: A case study for malware classification
    Gibert, Daniel
    Planes, Jordi
    Mateu, Carles
    Le, Quan
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [39] Deep Feature Extraction and Classification of Android Malware Images
    Singh, Jaiteg
    Thakur, Deepak
    Ali, Farman
    Gera, Tanya
    Kwak, Kyung Sup
    SENSORS, 2020, 20 (24) : 1 - 29
  • [40] A few-shot malware classification approach for unknown family recognition using malware feature visualization
    Conti, Mauro
    Khandhar, Shubham
    Vinod, P.
    COMPUTERS & SECURITY, 2022, 122