An Open Dataset of Cyber Asset Graphs for Cybercrime Research

Cited by: 0
Authors
Zhao, Xin [1]
Li, Shaolong [1]
Zhao, Ying [1]
Fu, Shuowen [1]
Chen, Yunpeng [1]
Zhou, Fangfang [1]
Huang, Xin [2]
Li, Yuwei [2]
Chen, Zhuo [2]
Affiliations
[1] Central South University, School of Computer Science and Engineering, Changsha 410083, People's Republic of China
[2] Qi An Xin Technology Group Inc., Beijing 100015, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Cyber asset; cybercrime; graph; open dataset
DOI
10.1109/TBDATA.2024.3403371
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Cybercrime poses a severe threat to the entire Internet ecosystem. Cyber assets such as domain names, IP addresses, and security certificates are staple infrastructure of cybercrime. A cyber asset graph (CAG) is a collection of closely related cyber assets held by a cybercrime gang to support online criminal activities. Analyzing CAGs provides rich insights for cybercrime investigation and governance. This paper introduces an open dataset of CAGs comprising 2.37 million nodes across eight types of cyber assets and 3.28 million edges across eleven types of relations. It describes the dataset construction process, its application areas, and the experience of using the dataset in the ChinaVis Data Challenge 2022. The dataset contains numerous CAGs of real-world cybercrime gangs and is the first open dataset of CAGs for cybercrime research. It can also support other application-oriented areas, such as cyber asset management and cyber-physical-social systems, and graph-related research areas, such as graph theory, graph mining, and graph visualization.
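To make the notion of a CAG concrete, the following is a minimal sketch, not the paper's construction method. It assumes a CAG can be approximated as the set of assets reachable from a seed asset through any relation. The node identifiers, the type labels ("domain", "ip", "cert"), and the relation names ("resolves_to", "uses_cert") are hypothetical and do not reflect the dataset's actual schema.

```python
from collections import defaultdict, deque

# Hypothetical toy data: node id -> asset type (not the dataset's real schema).
nodes = {
    "d1": "domain",
    "d2": "domain",
    "d3": "domain",
    "ip1": "ip",
    "ip2": "ip",
    "c1": "cert",
}

# Hypothetical typed edges: (source, target, relation).
edges = [
    ("d1", "ip1", "resolves_to"),
    ("d2", "ip1", "resolves_to"),
    ("d1", "c1", "uses_cert"),
    ("d3", "ip2", "resolves_to"),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map, ignoring relation types."""
    adj = defaultdict(set)
    for source, target, _relation in edge_list:
        adj[source].add(target)
        adj[target].add(source)
    return adj


def connected_assets(seed, adj):
    """Breadth-first search: return every asset reachable from the seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


adj = build_adjacency(edges)
group = connected_assets("d1", adj)
```

Here `group` holds the assets linked to the seed domain `d1`, while `d3` and `ip2` form a separate group. On real data, plain reachability would likely over-merge unrelated assets, so relation types and hop limits would need to be taken into account.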
Pages: 438-446
Page count: 9