SSCL-TransMD: Semi-Supervised Continual Learning Transformer for Malicious Software Detection

Cited by: 1

Authors:

Kou, Liang [1, 2]

Zhao, Donghui [2]

Han, Hui [1]

Xu, Xiong [1]

Gong, Shuaige [1]

Wang, Liandong [1]

Affiliations:

[1] State Key Lab Complex Electromagnet Environm Effec, Luoyang 471000, Peoples R China

[2] Hangzhou Dianzi Univ, Coll Cyberspace, Hangzhou 310018, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2023 / Volume 13 / Issue 22

Keywords:

Android malware detection; deep learning; transformer; semi-supervised continual learning; malware detection

DOI:

10.3390/app132212255

Chinese Library Classification number:

O6 [Chemistry]

Discipline classification code:

0703

Abstract:

Machine learning-based malware (malicious software) detection methods have a wide range of real-world applications. However, these approaches suffer from "model aging": the model's effectiveness declines rapidly as malware evolves and new variants continuously emerge. Model aging is usually addressed by retraining, which relies on large numbers of labeled samples obtained at great expense. To address this challenge, this paper proposes a semi-supervised continual learning malware detection model based on the Transformer. First, the model improves the lifelong semi-supervised mixture algorithm to dynamically adjust the weighted combination of new sample sequences and historical ones to solve the imbalance problem. Second, the Learning with Local and Global Consistency algorithm iteratively computes similarity scores for the unlabeled samples in the mixed samples to obtain pseudo-labels. Finally, a Multilayer Perceptron performs malware classification. To validate the model, the paper conducts experiments on the CICMalDroid2020 dataset. The experimental results show that the proposed model outperforms existing deep learning detection models. In binary classification, the F1 score improves by 1.27% on average over other models. After hybrid samples containing historical and new data are input four times, the F1 score is still 1.96% higher than that of other models.
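
The pseudo-labeling step named in the abstract can be illustrated with a minimal sketch of the standard Learning with Local and Global Consistency iteration, F(t+1) = alpha * S * F(t) + (1 - alpha) * Y, where S is the symmetrically normalized affinity matrix. This is the textbook form of the algorithm and not necessarily the authors' implementation; the function name, the Gaussian kernel, the values of alpha and sigma, and the toy data are all assumptions made for illustration.

```python
import math

def llgc_pseudo_labels(points, labels, n_classes, alpha=0.9, sigma=1.0, n_iter=200):
    """Assign a class index to every sample via LLGC label propagation.

    labels[i] is a class index, or None for an unlabeled sample.
    All parameter names and defaults here are illustrative assumptions.
    """
    n = len(points)
    # Affinity matrix W: Gaussian kernel on squared distance, zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    # Assumes every sample has a nonzero degree (true for this toy data).
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Initial label matrix Y: one-hot rows for labeled samples, zeros otherwise.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        # Each sample absorbs scores from its neighbors and keeps part of Y.
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n)) + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    # The pseudo-label is the class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Toy example: two well-separated clusters, one labeled sample in each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
labels = [0, None, None, 1, None, None]
print(llgc_pseudo_labels(points, labels, n_classes=2))  # prints [0, 0, 0, 1, 1, 1]
```

In this toy run the single labeled sample in each cluster spreads its label to its unlabeled neighbors. In a malware setting the points would be feature vectors of applications rather than 2-D coordinates, which is again an assumption about how the step is applied.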

Pages: 21
