SDIF-CNN: Stacking deep image features using fine-tuned convolution neural network models for real-world malware detection and classification

Cited by: 9

Authors:

Kumar, Sanjeev [1]

Panda, Kajal [1]

Affiliations:

[1] Ctr Dev Adv Comp C DAC, Cyber Secur Technol Div CSTD, Mohali, India

Source:

APPLIED SOFT COMPUTING | 2023 / Vol. 146

Keywords:

Malware detection; Machine learning; Convolutional neural networks; Deep learning; Cybersecurity; VISUALIZATION;

DOI:

10.1016/j.asoc.2023.110676

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The detection of malware is a complex problem in the area of Internet security. A malware defense system that can detect large-scale malware at lower cost is needed. This paper proposes a novel malware detection and classification architecture based on image visualization, SDIF-CNN: Stacking deep image features using fine-tuned convolution neural networks. A hybrid transfer learning methodology is designed that uses deep convolution neural network models both for fine-tuning and as feature extractors. At first, the pre-trained VGG16 CNN model is deeply fine-tuned with different hyperparameters, including the number of layers, learning rate, momentum, etc. The transfer learning-based fine-tuned VGG16 model is used as a feature extractor along with three similar pre-trained CNN models, VGG19, ResNet50, and InceptionV3, to obtain diverse feature maps. The extracted features are horizontally concatenated to construct a single feature map. Different feature selection methodologies, including filter-based methods and embedded methods, such as linear regression and random forest, are designed to discard the irrelevant features from the stacked feature map. After that, this study trains six machine learning and deep learning classifiers, namely K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), Random Forest (RF), Multi-Layer Perceptron (MLP), Extra Tree (ET), and Gaussian Naive Bayes (GNB), using the stacked feature map as the training feature vector. Hyperparameter optimization of the MLP model, the best classifier, is performed using a randomized search algorithm to devise an optimal classifier. The experiments are performed using the publicly benchmarked MalImg dataset of 9339 images from 25 families. The model is also validated on real-world and packed malicious programs to prove the generalization of the proposed methodology in detecting real-world malware.
In the proposed system, the MLP model obtained the best performance results as 98.55% accuracy, 99% precision, 99% recall, and 99% F1-score for MalImg datasets, and accuracy of 94.78% for real-world malware datasets. The proposed methodology is resilient to commonly used obfuscation techniques and does not depend upon code disassembly, reverse engineering analysis, and highly resource-intensive dynamic analysis. © 2023 Elsevier B.V. All rights reserved.
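
The stacking, feature selection, and classifier tuning steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, replaces the four CNN feature extractors with synthetic feature blocks, uses a random forest as the only embedded selector, and picks an arbitrary small search space. The helper names `stack_features` and `build_search` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def stack_features(blocks):
    """Horizontally concatenate per-model feature matrices (rows must align)."""
    return np.hstack(blocks)


def build_search(random_state=0):
    """Scaling -> random-forest feature selection -> MLP, tuned by randomized search."""
    pipeline = make_pipeline(
        StandardScaler(),
        # Embedded selection: keep features whose importance exceeds the mean.
        SelectFromModel(RandomForestClassifier(n_estimators=50, random_state=random_state)),
        MLPClassifier(max_iter=300, random_state=random_state),
    )
    # Illustrative search space only; the paper's actual ranges are not given here.
    param_distributions = {
        "mlpclassifier__hidden_layer_sizes": [(32,), (64,), (64, 32)],
        "mlpclassifier__alpha": [1e-4, 1e-3, 1e-2],
        "mlpclassifier__learning_rate_init": [1e-3, 1e-2],
    }
    return RandomizedSearchCV(
        pipeline, param_distributions, n_iter=4, cv=3, random_state=random_state
    )


# Synthetic stand-in for deep image features: 300 samples, 40 features, 3 classes.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, n_classes=3, random_state=0
)
# Pretend the columns came from four different extractors (assumption for the demo).
blocks = np.array_split(X, 4, axis=1)
stacked = stack_features(blocks)

X_train, X_test, y_train, y_test = train_test_split(
    stacked, y, test_size=0.25, stratify=y, random_state=0
)
search = build_search()
search.fit(X_train, y_train)

selector = search.best_estimator_.named_steps["selectfrommodel"]
n_kept = int(selector.get_support().sum())
test_acc = search.score(X_test, y_test)
print("features kept:", n_kept, "of", stacked.shape[1])
print("best params:", search.best_params_)
print("held-out accuracy:", round(test_acc, 3))
```

Placing the selector inside the pipeline means it is refit on each cross-validation fold, so the search does not see information from the validation split when choosing features.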

References

Pages: 19

57 entries in total

[1] Abbasi, Muhammad Shabbir; Al-Sahaf, Harith; Mansoori, Masood; Welch, Ian. Behavior-based ransomware classification: A particle swarm optimization wrapper-based approach for feature selection [J]. APPLIED SOFT COMPUTING, 2022, 121
[2] Nataraj L, 2011, P 8 INT S VIS CYB SE, P1, DOI 10.1145/2016904.2016908
[3] Avdiienko, Vitalii; Kuznetsov, Konstantin; Gorla, Alessandra; Zeller, Andreas; Arzt, Steven; Rasthofer, Siegfried; Bodden, Eric. Mining Apps for Abnormal Usage of Sensitive Data [J]. 2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 1, 2015: 426-436
[4] Awan, Mazhar Javed; Masood, Osama Ahmed; Mohammed, Mazin Abed; Yasin, Awais; Zain, Azlan Mohd; Damasevicius, Robertas; Abdulkareem, Karrar Hameed. Image-Based Malware Classification Using VGG19 Network and Spatial Convolutional Attention [J]. ELECTRONICS, 2021, 10(19)
[5] Bhodia N., 2019, P 5 INT C INF SYST
[6] Chaganti, Rajasekhar; Ravi, Vinayakumar; Pham, Tuan D. Image-based malware representation approach with EfficientNet convolutional neural networks for effective malware classification [J]. JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2022, 69
[7] cisco, 2020, Cisco Annual Internet Report
[8] Cui, Zhihua; Du, Lei; Wang, Penghong; Cai, Xingjuan; Zhang, Wensheng. Malicious code detection based on CNNs and multi-objective algorithm [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 129: 50-58
[9] Cui, Zhihua; Xue, Fei; Cai, Xingjuan; Cao, Yang; Wang, Gai-ge; Chen, Jinjun. Detection of Malicious Code Variants Based on Deep Learning [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14(7): 3187-3196
[10] Dash, Santanu Kumar; Suarez-Tangil, Guillermo; Khan, Salahuddin; Tam, Kimberly; Ahmadi, Mansour; Kinder, Johannes; Cavallaro, Lorenzo. DroidScribe: Classifying Android Malware Based on Runtime Behavior [J]. 2016 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2016), 2016: 252-261
