SSCL-TransMD: Semi-Supervised Continual Learning Transformer for Malicious Software Detection

Cited by: 1
Authors
Kou, Liang [1 ,2 ]
Zhao, Donghui [2 ]
Han, Hui [1 ]
Xu, Xiong [1 ]
Gong, Shuaige [1 ]
Wang, Liandong [1 ]
Affiliations
[1] State Key Lab Complex Electromagnet Environm Effec, Luoyang 471000, Peoples R China
[2] Hangzhou Dianzi Univ, Coll Cyberspace, Hangzhou 310018, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 22
Keywords
android malware detection; deep learning; transformer; semi-supervised continual learning; MALWARE DETECTION
DOI
10.3390/app132212255
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Machine learning-based malware (malicious software) detection methods are widely used in real-world applications. However, these approaches suffer from "model aging": the effectiveness of a model decreases rapidly as malware evolves and new variants continuously emerge. Model aging is usually addressed by retraining, which requires large numbers of labeled samples that are expensive to obtain. To address this challenge, this paper proposes a Transformer-based semi-supervised continual learning model for malware detection. First, the model improves the lifelong semi-supervised mixture algorithm so that it dynamically adjusts the weighted combination of new and historical sample sequences to address the imbalance problem. Second, the Learning with Local and Global Consistency (LLGC) algorithm iteratively computes similarity scores for the unlabeled samples in the mixed set to obtain pseudo-labels. Finally, a Multilayer Perceptron (MLP) performs the malware classification. The model is evaluated on the CICMalDroid2020 dataset. The experimental results show that the proposed model outperforms existing deep learning detection models: in binary classification, its F1 score is on average 1.27% higher than that of the other models, and after hybrid samples containing both historical and new data have been input four times, its F1 score is still 1.96% higher.
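The pseudo-labeling step named in the abstract, Learning with Local and Global Consistency, can be illustrated with a minimal sketch of the textbook algorithm (Gaussian affinity, symmetric normalization, iterative label spreading). This is not the paper's implementation: the function name `llgc_pseudo_labels`, the parameters `alpha`, `sigma`, and `n_iter`, and the convention that `-1` marks an unlabeled sample are all assumptions made for illustration, and the record does not state how the authors configure or integrate this step.

```python
import numpy as np


def llgc_pseudo_labels(X, y, alpha=0.9, sigma=1.0, n_iter=50):
    """Textbook label spreading; y uses -1 for unlabeled samples (assumed convention)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]
    classes = np.unique(y[y >= 0])

    # Gaussian affinity matrix with a zero diagonal.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot matrix of the known labels; rows of unlabeled samples stay zero.
    Y = np.zeros((n, len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0

    # Iterate F <- alpha * S @ F + (1 - alpha) * Y.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y

    # Each sample receives the class with the highest score.
    return classes[F.argmax(axis=1)], F
```

Calling `llgc_pseudo_labels(X, y)` returns one label per sample together with the score matrix `F`; in a pipeline like the one the abstract outlines, the labels assigned to the originally unlabeled rows would serve as pseudo-labels for the downstream classifier.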
Pages: 21