Malicious Domain Name Detection Method Based on Graph Contrastive Learning

Cited by: 0
Authors
Zhang, Zhen [1]
Zhang, San-Feng [1, 2]
Yang, Wang [1, 2]
Affiliations
[1] School of Cyber Science and Engineering, Southeast University, Nanjing
[2] Key Laboratory of Computer Network and Information Integration of Ministry of Education, Southeast University, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2024, Vol. 35, No. 10
Keywords
asymmetric coding; attribute heterogeneous graph; graph neural network (GNN); malicious domain name detection; self-supervised learning
DOI
10.13328/j.cnki.jos.006964
Abstract
Domain names play an important role in cybercrime. Existing malicious domain name detection methods struggle to exploit rich topology and attribute information and also require large amounts of labeled data, which limits detection performance and raises costs. To address this problem, this study proposes a malicious domain name detection method based on graph contrastive learning. Domain names and IP addresses are taken as two types of nodes in a heterogeneous graph, and a feature matrix is built for each node type from the node attributes. Three types of meta paths are constructed based on the inclusion relationship between domain names, a similarity measure, and the correspondence between domain names and IP addresses. In the pretraining stage, a contrastive learning model based on an asymmetric encoder is applied, which avoids the damage to graph structure and semantics caused by graph data augmentation and reduces the demand for computing resources. With the inductive graph neural network encoders HeteroSAGE and HeteroGAT, a node-centric mini-batch training strategy is adopted to explore the aggregation relationship between the target node and its neighbor nodes, which addresses the poor applicability of transductive graph neural networks in dynamic scenarios. For the downstream classification task, logistic regression and random forest classifiers are used and compared. Experimental results on publicly available datasets show that detection performance improves by two to six percentage points over related work. © 2024 Chinese Academy of Sciences. All rights reserved.
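The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the resolution records, feature vectors, and function names below are hypothetical, only the domain-IP-domain meta path is shown, and a plain neighbor mean stands in for the HeteroSAGE aggregation step, which the abstract does not specify in detail.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical DNS resolution records (domain, IP); not data from the paper.
resolutions = [
    ("a.example.com", "192.0.2.1"),
    ("b.example.net", "192.0.2.1"),
    ("b.example.net", "192.0.2.2"),
    ("c.example.org", "192.0.2.2"),
]

# Hypothetical two-dimensional attribute vectors for the domain nodes.
features = {
    "a.example.com": [1.0, 0.0],
    "b.example.net": [0.0, 1.0],
    "c.example.org": [1.0, 1.0],
}


def domain_ip_domain_pairs(records):
    """Return domain pairs connected by a domain-IP-domain meta path."""
    domains_by_ip = defaultdict(set)
    for domain, ip in records:
        domains_by_ip[ip].add(domain)
    pairs = set()
    for domains in domains_by_ip.values():
        # Every two domains resolving to the same IP are meta-path neighbors.
        for u, v in combinations(sorted(domains), 2):
            pairs.add((u, v))
    return pairs


def neighbor_mean(feats, pairs, node):
    """Element-wise mean of the feature vectors of a node's neighbors.

    A simplified stand-in for a SAGE-style mean aggregator; falls back to
    the node's own features when it has no neighbors.
    """
    neigh = [v if u == node else u for u, v in pairs if node in (u, v)]
    if not neigh:
        return list(feats[node])
    dim = len(feats[node])
    return [sum(feats[n][i] for n in neigh) / len(neigh) for i in range(dim)]


pairs = domain_ip_domain_pairs(resolutions)
aggregated = neighbor_mean(features, pairs, "b.example.net")
```

Because aggregation only needs a node's own neighborhood, the same function applies to a domain that appears after the pairs were first computed, which is the property that makes inductive encoders attractive in dynamic settings.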
Pages: 4837-4858
Page count: 21