A2-CLM: Few-Shot Malware Detection Based on Adversarial Heterogeneous Graph Augmentation

Cited by: 3
Authors
Liu, Chen [1]
Li, Bo [1,2]
Zhao, Jun [3]
Feng, Weiwei [1]
Liu, Xudong [1,2]
Li, Chunpei [4,5]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] Zhongguancun Lab, Beijing 100094, Peoples R China
[3] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250014, Peoples R China
[4] Guangxi Normal Univ, Sch Comp Sci & Engn, Guilin 541004, Peoples R China
[5] Guangxi Normal Univ, Sch Software, Guilin 541004, Peoples R China
Keywords
Malware; Behavioral sciences; Task analysis; Sensitivity; Semantics; Feature extraction; Training; Few-shot malware detection; security semantic; graph contrastive learning; adversarial augmentation
DOI
10.1109/TIFS.2023.3345640
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Malware attacks, especially "few-shot" malware, have profoundly harmed the cyber ecosystem. Recently, malware detection models based on graph neural networks have achieved remarkable success. However, these efforts over-rely on sufficient labeled data for model training and thus may be brittle in few-shot malware detection because of the label scarcity. To this end, we propose a self-supervised malware detection framework based on graph contrastive learning and adversarial augmentation, termed A2-CLM, to address the challenge of few-shot malware detection. Particularly, A2-CLM first depicts the malware execution context with a sensitivity heterogeneous graph by assessing the security semantic of each behavior. Afterwards, A2-CLM designs multiple adversarial attacks to generate more practical contrastive pairs, including the PGD attack, attribute masking attack, meta-graph-guide sampling attack, direct system calls attack, and obfuscation attack, which is beneficial to strengthening the model's effectiveness and robustness. To alleviate the training workload of contrastive learning, we introduce a momentum strategy to train the multiple graph encoders in A2-CLM. Especially on 1-shot detection tasks, A2-CLM achieves performance gains of up to 24.63% and 4.58% against supervised and self-supervised detection methods, respectively.
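The abstract mentions a momentum strategy for training the graph encoders but does not state the update rule. The sketch below assumes the exponential-moving-average scheme commonly used in momentum-based contrastive learning, where a slowly moving "key" encoder tracks a gradient-trained "query" encoder. The function name `momentum_update`, the coefficient `m`, and the flat parameter lists are hypothetical and chosen for illustration only; they are not taken from A2-CLM.

```python
from typing import List


def momentum_update(query_params: List[float],
                    key_params: List[float],
                    m: float = 0.99) -> List[float]:
    """Return the key-encoder parameters after one momentum step.

    Assumed rule (not confirmed by the abstract):
        key <- m * key + (1 - m) * query
    """
    if len(query_params) != len(key_params):
        raise ValueError("encoders must have the same number of parameters")
    if not 0.0 <= m <= 1.0:
        raise ValueError("momentum m must lie in [0, 1]")
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


# Toy example: the key encoder drifts toward a fixed query encoder.
query = [1.0, 2.0, 3.0]
key = [0.0, 0.0, 0.0]
for _ in range(3):
    key = momentum_update(query, key, m=0.5)
print(key)
```

Under this assumed rule only the query encoder needs backpropagation, which is one plausible way a momentum strategy could reduce the training workload when several encoders are involved.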
Pages: 2023-2038
Page count: 16