Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2
Authors
Zhang, Yunchun [1 ]
Liao, Zikun [1 ]
Zhang, Ning [1 ]
Min, Shaohui [1 ]
Wang, Qi [1 ]
Quek, Tony Q. S. [2 ]
Zhao, Mingxiong [1 ]
Affiliations
[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 16
Funding
National Natural Science Foundation of China
Keywords
Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); malware classification; malware images; NETWORK
DOI
10.1109/JIOT.2024.3353250
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, effectively detecting malware in large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address this issue, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are developed by extracting the high-layer outputs (feature maps) from a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance, or dissimilarity, among hash values computed from malware samples of different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K similar samples. In the second part, we propose majority voting and Hamming-distance-based voting for malware identification according to the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code length. Furthermore, the Hamming-distance-based voting method implemented with RNDPN outperforms the other models in malware classification, achieving 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
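The retrieve-then-vote pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact voting rule or the criterion for flagging new malware, so the `1 / (1 + distance)` weighting, the `new_threshold` cutoff, and all function names below are assumptions made for illustration. Hash codes are represented as plain Python integers (8 bits here for readability; the abstract reports 48-bit codes).

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary hash codes."""
    return bin(a ^ b).count("1")


def retrieve_top_k(query, codes, k):
    """Return (index, distance) pairs for the k codes nearest to `query`."""
    ranked = sorted((hamming(query, c), i) for i, c in enumerate(codes))
    return [(i, d) for d, i in ranked[:k]]


def majority_vote(neighbours, labels):
    """Most frequent family label among the retrieved samples."""
    return Counter(labels[i] for i, _ in neighbours).most_common(1)[0][0]


def distance_weighted_vote(neighbours, labels):
    """Assumed weighting: each neighbour votes with weight 1 / (1 + distance)."""
    scores = defaultdict(float)
    for i, d in neighbours:
        scores[labels[i]] += 1.0 / (1.0 + d)
    return max(scores, key=scores.get)


def classify(query, codes, labels, k, new_threshold):
    """Assumed rule: flag the sample as unknown if even the nearest
    retrieved code is farther away than `new_threshold` bits."""
    neighbours = retrieve_top_k(query, codes, k)
    if neighbours[0][1] > new_threshold:
        return "unknown"
    return distance_weighted_vote(neighbours, labels)


# Toy database of 8-bit hash codes with family labels (made-up data).
codes = [0b11110000, 0b11111110, 0b00010000, 0b00001111, 0b00000111]
labels = ["B", "A", "A", "A", "B"]

top3 = retrieve_top_k(0b11110000, codes, k=3)
print(top3)                                  # [(0, 0), (1, 3), (2, 3)]
print(majority_vote(top3, labels))           # A
print(distance_weighted_vote(top3, labels))  # B
print(classify(0b01010101, codes, labels, k=3, new_threshold=2))  # unknown
```

In this toy example the two voting schemes disagree: two of the three retrieved samples belong to family A, but the single exact match from family B outweighs them once votes are scaled by Hamming distance.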
Pages: 26837-26851
Page count: 15