Design of an Iterative Method for Malware Detection Using Autoencoders and Hybrid Machine Learning Models

Authors
Beg, Rijvan [1]
Pateriya, R. K. [1]
Tomar, Deepak Singh [1]
Affiliations
[1] Maulana Azad Natl Inst Technol, Comp Sci & Engn Dept, Bhopal 462003, Madhya Pradesh, India
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Malware; Feature extraction; Training; Machine learning; Convolutional neural networks; Data models; Robustness; Deep learning; Adaptation models; Accuracy; Autoencoders; gradient boosted decision trees; adversarial training; malware analysis; machine learning techniques; DETECTION SYSTEM;
DOI
10.1109/ACCESS.2024.3491185
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In the evolving cyber threat landscape, one of the most visible and pernicious challenges is malware activity detection and analysis. Traditional detection and analysis methods struggle with high-dimensional data, weak robustness against adversarial attacks, and inefficient use of unlabeled data samples. In this context, we propose a comprehensive framework that applies machine learning methods to enhance evidence collection and malware activity analysis. The proposed model combines several advanced machine learning methods. First, we apply an autoencoder-based feature learning technique to reduce the dimensionality of raw malware activity data by 50% while preserving critical information, as evidenced by minimal reconstruction error. This technique assists in the extraction of compact, informative feature representations covering both global and local discriminative patterns for accurate malware detection. By applying Gradient Boosted Decision Trees (GBDT) to features derived from Convolutional Neural Networks (CNN), we further improve the capability of the model. The hybrid model combines the outlier robustness and heterogeneous data handling capability of GBDTs with the hierarchical feature extraction capability of CNNs, resulting in a significant improvement in performance, with an F1-score of 0.95 on a validation set. To defend against evasion attacks, we incorporate adversarial training using Generative Adversarial Networks (GANs). This enables effective counteraction against adversarial strategies, reducing adversarial success rates by 60%. The model is trained on adversarial examples, and its parameters are optimized to minimize classification loss across both normal and distorted inputs, thereby enhancing robustness.
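The autoencoder step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architecture, so the example assumes a purely linear encoder and decoder, a toy 4-feature dataset, and hypothetical names (`matvec`, `train_linear_autoencoder`). It only shows the idea of a bottleneck half the input width, trained to minimize reconstruction error.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def train_linear_autoencoder(data, epochs=500, lr=0.05, seed=0):
    """Train a linear autoencoder whose latent width is half the input width.

    Assumption: full-batch gradient descent on mean squared reconstruction
    error; the paper's actual model and optimizer are not stated in the abstract.
    Returns (encoder weights, decoder weights, per-epoch loss history).
    """
    rng = random.Random(seed)
    n_in = len(data[0])
    n_latent = n_in // 2  # 50% dimensionality reduction
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_latent)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_latent)] for _ in range(n_in)]
    history = []
    for _ in range(epochs):
        g_enc = [[0.0] * n_in for _ in range(n_latent)]
        g_dec = [[0.0] * n_latent for _ in range(n_in)]
        loss = 0.0
        for x in data:
            z = matvec(enc, x)        # compressed representation
            x_hat = matvec(dec, z)    # reconstruction
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            loss += sum(e * e for e in err)
            # Gradient of the squared error with respect to the latent code.
            dz = [2.0 * sum(dec[i][j] * err[i] for i in range(n_in))
                  for j in range(n_latent)]
            for i in range(n_in):
                for j in range(n_latent):
                    g_dec[i][j] += 2.0 * err[i] * z[j]
            for j in range(n_latent):
                for k in range(n_in):
                    g_enc[j][k] += dz[j] * x[k]
        history.append(loss / len(data))  # loss measured before this update
        scale = lr / len(data)
        for j in range(n_latent):
            for k in range(n_in):
                enc[j][k] -= scale * g_enc[j][k]
        for i in range(n_in):
            for j in range(n_latent):
                dec[i][j] -= scale * g_dec[i][j]
    return enc, dec, history


# Toy rows of the form [a, b, a, b]: four features with two degrees of freedom.
toy_data = [[a, b, a, b] for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
encoder, decoder, losses = train_linear_autoencoder(toy_data)
print("latent width:", len(encoder), "of", len(toy_data[0]))
print("reconstruction error: first epoch %.4f, last epoch %.4f"
      % (losses[0], losses[-1]))
```

The toy rows lie in a two-dimensional subspace, so a two-unit bottleneck can in principle reconstruct them well. Real malware activity features would need a nonlinear model and evaluation on held-out data.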
Expanding the applicability of the framework, we apply semi-supervised self-training with Variational Autoencoders (VAEs) to exploit both labeled and unlabeled samples. This approach not only improves anomaly detection by 30% but also allows the model to learn probabilistic latent representations, thereby revealing underlying data structures. Finally, we address the challenge of temporal malware activity analysis through Long Short-Term Memory (LSTM) networks augmented with an attention mechanism. This configuration allows the model to detect and adapt to evolving attack patterns, improving zero-day attack detection by 25%.
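As a rough illustration of the attention idea, the sketch below pools a sequence of per-timestep hidden states into one context vector. It assumes dot-product scoring against a query vector, which the abstract does not specify; the hidden states are hand-written stand-ins for LSTM outputs, and `softmax` and `attention_pool` are hypothetical helper names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each timestep by its dot product with the query, then average.

    Returns (attention weights, context vector).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Stand-in hidden states for three timesteps (assumed values, not real LSTM output).
states = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.1]]
weights, context = attention_pool(states, query=[1.0, 0.0])
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

The timestep whose hidden state aligns most with the query receives the largest weight, which is how such a layer can emphasize the parts of an activity trace most relevant to the classification.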
Pages: 175032 - 175055
Page count: 24
Related papers
50 in total
  • [1] Detection of Android Malware Using Machine Learning and Siamese Shot Learning Technique for Security
    Almarshad, Fahdah A.
    Zakariah, Mohammed
    Gashgari, Ghada Abdalaziz
    Aldakheel, Eman Abdullah
    Alzahrani, Abdullah I. A.
    IEEE ACCESS, 2023, 11 : 127697 - 127714
  • [2] A Malware Detection Approach Using Autoencoder in Deep Learning
    Xing, Xiaofei
    Jin, Xiang
    Elahi, Haroon
    Jiang, Hai
    Wang, Guojun
    IEEE ACCESS, 2022, 10 : 25696 - 25706
  • [3] Design of an Iterative Method for Time Series Forecasting Using Temporal Attention and Hybrid Deep Learning Architectures
    Boddu, Yuvaraja
    Manimaran, A.
    IEEE ACCESS, 2025, 13 : 25683 - 25703
  • [4] Malware Detection Using Machine Learning
    Kumar, Ajay
    Abhishek, Kumar
    Shah, Kunjal
    Patel, Divy
    Jain, Yash
    Chheda, Harsh
    Nerurka, Pranav
    KNOWLEDGE GRAPHS AND SEMANTIC WEB, KGSWC 2020, 2020, 1232 : 61 - 71
  • [5] An Effective Malware Detection Method Using Hybrid Feature Selection and Machine Learning Algorithms
    Namita Dabas
    Prachi Ahlawat
    Prabha Sharma
    Arabian Journal for Science and Engineering, 2023, 48 : 9749 - 9767
  • [6] An Effective Malware Detection Method Using Hybrid Feature Selection and Machine Learning Algorithms
    Dabas, Namita
    Ahlawat, Prachi
    Sharma, Prabha
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (08) : 9749 - 9767
  • [7] Improving Machine Learning Models for Malware Detection Using Embedded Feature Selection Method
    Chemmakha, Mohammed
    Habibi, Omar
    Lazaar, Mohamed
    IFAC PAPERSONLINE, 2022, 55 (12) : 771 - 776
  • [8] Android Malware Detection Using Machine Learning
    Droos, Ayat
    Al-Mahadeen, Awss
    Al-Harasis, Tasnim
    Al-Attar, Rama
    Ababneh, Mohammad
    2022 13TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2022 : 36 - 41
  • [9] Advanced Machine Learning Based Malware Detection Systems
    Kim, Song-Kyoo
    Feng, Xiaomei
    Al Hamadi, Hussam
    Damiani, Ernesto
    Yeun, Chan Yeob
    Nandyala, Sivaprasad
    IEEE ACCESS, 2024, 12 : 115296 - 115305
  • [10] Automatic malware classification and new malware detection using machine learning
    Liu Liu
    Bao-sheng Wang
    Bo Yu
    Qiu-xi Zhong
    Frontiers of Information Technology & Electronic Engineering, 2017, 18 : 1336 - 1347