Intrusion Detection Model Based on TF.IDF and C4.5 Algorithms

Cited by: 6
Authors
Awadh, Khaldoon [1]
Akbas, Ayhan [2]
Affiliations
[1] Univ Turkish Aeronaut Assoc, Comp Engn Dept, Ankara, Turkey
[2] Cankiri Karatekin Univ, Comp Engn Dept, Cankiri, Turkey
Source
JOURNAL OF POLYTECHNIC-POLITEKNIK DERGISI | 2021 / Vol. 24 / Issue 04
Keywords
IDS; TF.IDF; data mining; machine learning; network security
DOI
10.2339/politeknik.693221
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
In recent years, machine learning and data mining technologies have drawn researchers' attention as new ways to improve the performance of Intrusion Detection Systems (IDS). These techniques have proven effective at distinguishing malicious network packets. One of the most challenging problems researchers face is transforming data into a form that Machine Learning Algorithms (MLA) can handle effectively. In this paper, we present an IDS model based on the C4.5 decision tree algorithm, with transformation of the simulated UNSW-NB15 dataset as a pre-processing operation. Our model uses Term Frequency.Inverse Document Frequency (TF.IDF) to convert data types into an acceptable and efficient form for machine learning, in order to achieve high detection performance. The model has been tested with 250,000 randomly selected records of the UNSW-NB15 dataset. The selected records have been grouped into segments of various sizes, such as 50, 500, 1000, and 5000 items. Each segment has been further grouped into two subsets: multi-class and binary-class datasets. The performance of the C4.5 decision tree algorithm has been compared with that of Multilayer Perceptron (MLP) and Naive Bayes (NB) in Weka software. Our proposed method has significantly improved the accuracy of the classifiers and decreased the number of incorrectly detected instances. The increase in accuracy reflects the efficiency of transforming the dataset with TF.IDF at various segment sizes.
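The abstract does not spell out how TF.IDF is applied to network records, so the following is only a minimal sketch of one plausible reading: each fixed-size segment of records is treated as a "document", each categorical value (for example a protocol name) as a "term", and the value is replaced by its TF.IDF weight within its segment. The function names, the `protocols` column, and the segment size of 4 are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def segment(values, size):
    """Split a list into consecutive segments of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def tfidf_per_segment(values, segment_size):
    """Replace each categorical value with a TF.IDF weight.

    Assumed definitions (standard, unsmoothed TF.IDF):
      TF  = occurrences of the value in its segment / segment length
      IDF = log(number of segments / number of segments containing the value)
    """
    segments = segment(values, segment_size)
    n_segments = len(segments)

    # Document frequency: in how many segments does each value occur?
    df = Counter()
    for seg in segments:
        df.update(set(seg))

    weights = []
    for seg in segments:
        counts = Counter(seg)
        for value in seg:
            tf = counts[value] / len(seg)
            idf = math.log(n_segments / df[value])
            weights.append(tf * idf)
    return weights


# Hypothetical categorical column, split into two segments of four records.
protocols = ["tcp", "tcp", "udp", "tcp", "tcp", "tcp", "arp", "tcp"]
weights = tfidf_per_segment(protocols, segment_size=4)
```

Under these assumptions a value that occurs in every segment, such as `tcp` here, receives a weight of zero, while values confined to a single segment receive a positive weight. The resulting numeric column could then be passed to a decision tree learner such as Weka's J48, which is Weka's implementation of C4.5.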
Pages: 1691-1698
Page count: 8