Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2

Authors:

Zhang, Yunchun [1]

Liao, Zikun [1]

Zhang, Ning [1]

Min, Shaohui [1]

Wang, Qi [1]

Quek, Tony Q. S. [2]

Zhao, Mingxiong [1]

Affiliations:

[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China

[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 16

Funding:

National Natural Science Foundation of China

Keywords:

Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); image retrieval; malware classification; malware images; SEMANTICS; NETWORK;

DOI:

10.1109/JIOT.2024.3353250

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, effectively detecting malware in large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address this, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are built by extracting the high-layer outputs (feature maps) of a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance (dissimilarity) between the hash values of malware samples from different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K most similar samples. In the second part, majority voting and Hamming-distance-based voting are proposed to identify malware from the retrieved results. Experimental results show that RNDPN outperforms six other deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models perform best with a 48-bit hash code length. Furthermore, Hamming-distance-based voting implemented with RNDPN outperforms the other models in malware classification on two measures: 1) 96.5% malware classification accuracy and 2) 85.7% accuracy in identifying new or zero-day malware.
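The retrieve-then-vote idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact voting rule, so the weighting used here (each neighbor contributes 1 minus its normalized Hamming distance) is an assumption, as are the function names, the integer encoding of hash codes, and the toy 8-bit database. In the paper the codes would come from the trained RNDPN model rather than being written by hand.

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Count the differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def top_k(query: int, database: list[tuple[int, str]], k: int) -> list[tuple[int, str]]:
    """Return (distance, family) pairs for the k codes nearest to the query."""
    scored = [(hamming(query, code), family) for code, family in database]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def majority_vote(neighbors: list[tuple[int, str]]) -> str:
    """Pick the family that appears most often among the retrieved samples."""
    return Counter(family for _, family in neighbors).most_common(1)[0][0]


def distance_weighted_vote(neighbors: list[tuple[int, str]], code_bits: int = 48) -> str:
    """Pick the family with the largest similarity-weighted score.

    Assumed weighting (not taken from the paper): a neighbor at Hamming
    distance d contributes 1 - d / code_bits to its family's score.
    """
    scores: dict[str, float] = defaultdict(float)
    for dist, family in neighbors:
        scores[family] += 1.0 - dist / code_bits
    return max(scores, key=scores.get)


# Hypothetical 8-bit hash codes with made-up family labels.
DATABASE = [
    (0b00000000, "A"),
    (0b00000001, "A"),
    (0b11111111, "B"),
    (0b11111110, "B"),
    (0b11111100, "B"),
]

if __name__ == "__main__":
    neighbors = top_k(0b00000011, DATABASE, k=5)
    print(majority_vote(neighbors))
    print(distance_weighted_vote(neighbors, code_bits=8))
```

In this toy case the two rules disagree: family B supplies three of the five retrieved samples and wins the majority vote, while the two family A samples are much closer to the query and win the distance-weighted vote. That difference is one plausible reason a distance-aware rule could help when the retrieved set mixes families.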


Pages: 26837-26851

Page count: 15

