GeNIS: A modular dataset for network intrusion detection and classification

Cited by: 0
Authors
Silva, Miguel [1]
Pinto, Daniela [1]
Vitorino, Joao [1]
Goncalves, Jose [1]
Maia, Eva [1]
Praca, Isabel [1]
Affiliations
[1] Polytech Porto ISEP IPP, Sch Engn, Res Grp Intelligent Engn & Comp Adv Innovat & Dev, P-4249015 Porto, Portugal
Source
DATA IN BRIEF | 2025 / Volume 60
Keywords
Network flow; Packet capture; Attack classification; Anomaly detection; Machine learning; Cybersecurity; Dataset;
DOI
10.1016/j.dib.2025.111487
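Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];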
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The development of artificial intelligence solutions for cyberattack detection and classification requires high-quality and representative data. However, there is a scarcity of labelled datasets focused on the cyberattacks that target vulnerable small and medium-sized enterprises. To allow organizations to improve their intrusion detection systems according to their types of users, their active services, and the network protocols they use, it is necessary to provide reliable captures of different types of benign and malicious traffic. The GECAD Network Intrusion Scenarios (GeNIS) dataset contains multiple sequential attack scenarios and different types of realistic normal network activity, recorded during advanced network simulations on the Airbus CyberRange platform. The raw network packets were analyzed to generate labelled network flows, with statistical features computed to represent the traffic patterns of local and remote attackers, normal users and administrators, and the background traffic of an enterprise computer network. GeNIS follows a modular design, providing raw packet capture next generation (PCAPNG) files with over 37 million packets of each intermediate attack step to enable an in-depth analysis with different flow exporters, feature extraction, and feature selection tools, as well as filtered CSV files with over 2.8 million flows created with 5, 10, 30, and 60 s flow intervals. The flows were preprocessed to provide a reliable benchmark dataset with the most relevant features for the training, validation, and testing of robust machine learning and deep learning models.
Pages: 14