VulnMiner: A comprehensive framework for vulnerability collection from C/C++ source code projects

Cited by: 0
Authors
Bhandari, Guru [1]
Gavric, Nikola [1]
Shalaginov, Andrii [1]
Affiliations
[1] Kristiania Univ Coll, Cybersecur Dept, Oslo, Norway
Keywords
Vulnerability extraction tool; Static security analyzers; Vulnerabilities dataset; Source code; Machine learning; C/C++ code;
DOI
10.1016/j.simpa.2024.100713
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
The study introduces VulnMiner, a comprehensive framework encompassing a data extraction tool tailored for identifying vulnerabilities in C/C++ source code. It also presents an initial release of a vulnerability dataset, curated from widely used projects and annotated with vulnerable and benign instances. In this dataset, vulnerabilities are labeled by Common Weakness Enumeration (CWE) category. The open-source extraction tool collects vulnerability data using static security analyzers. The study also demonstrates the effectiveness of machine learning (ML) and natural language processing (NLP) models in accurately classifying vulnerabilities, as evidenced by their identification of numerous weaknesses in open-source projects.
Pages: 4
Related papers
17 in total
  • [1] iDetect for vulnerability detection in internet of things operating systems using machine learning
    Al-Boghdady, Abdullah
    El-Ramly, Mohammad
    Wassif, Khaled
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [2] Alnaeli S.M., 2017, Adv. Sci. Technol. Eng. Syst. J., V2, P1502, DOI 10.25046/aj0203188
  • [3] IoTvulCode: AI-enabled vulnerability detection in software products designed for IoT applications
    Bhandari, Guru Prasad
    Assres, Gebremariam
    Gavric, Nikola
    Shalaginov, Andrii
    Gronli, Tor-Morten
    [J]. INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2024, 23 (04) : 2677 - 2690
  • [4] Celik ZB, 2018, PROCEEDINGS OF THE 27TH USENIX SECURITY SYMPOSIUM, P1687
  • [5] DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection
    Chen, Yizheng
    Ding, Zhoujie
    Alowain, Lamya
    Chen, Xinyun
    Wagner, David
    [J]. PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON RESEARCH IN ATTACKS, INTRUSIONS AND DEFENSES, RAID 2023, 2023, : 654 - 668
  • [6] srcML: An Infrastructure for the Exploration, Analysis, and Manipulation of Source Code: A Tool Demonstration
    Collard, Michael L.
    Decker, Michael John
    Maletic, Jonathan I.
    [J]. 2013 29TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE (ICSM), 2013, : 516 - 519
  • [7] Cppcheck2.1, 2021, A tool for static C/C++ code analysis
  • [8] Dwheeler, 2021, Flawfinder v. 2.0.11
  • [9] A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries
    Fan, Jiahao
    Li, Yi
    Wang, Shaohua
    Nguyen, Tien N.
    [J]. 2020 IEEE/ACM 17TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2020, : 508 - 512
  • [10] Feng ZY, 2020, arXiv, DOI 10.48550/arXiv.2002.08155