ProvSec: Open Cybersecurity System Provenance Analysis Benchmark Dataset with Labels

Cited by: 0
Authors
Shrestha, Madhukar [1]
Kim, Yonghyun [1]
Oh, Jeehyun [1]
Rhee, Junghwan [1]
Choe, Yung Ryn [2]
Zuo, Fei [1]
Park, Myungah [1]
Qian, Gang [1]
Affiliations
[1] Univ Cent Oklahoma, Comp Sci Dept, 100 Univ North Dr, Edmond, OK 73034 USA
[2] Sandia Natl Labs, POB 969 MS 9105, Livermore, CA 94551 USA
Keywords
Provenance; Dataset; Attack; Backtracking; VIDEO; BLOCKCHAIN; FEATURES
DOI
10.1007/s44227-023-00014-9
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
System provenance forensic analysis has been the subject of a large body of research. It requires fine-grained data, such as system calls together with their event fields, to track the dependencies between events. Although security datasets have been proposed in prior work, we found that a dataset combining realistic attacks with the details needed for high-quality provenance tracking is lacking. We created a new dataset of eleven vulnerable cases for system forensic analysis. It includes the full details of system calls, including syscall parameters, and uses realistic attack scenarios with real software vulnerabilities and exploits. For each case, we created two sets of scenarios, benign and adversary, which are manually labeled for supervised machine-learning analysis. In addition, we present an algorithm to improve data quality in system provenance forensic analysis. We demonstrate the details of the dataset events and the dependency analysis of our dataset cases.
Pages: 112-123
Number of pages: 12