A Machine Learning-Based Framework with Enhanced Feature Selection and Resampling for Improved Intrusion Detection

Cited by: 3
Authors
Malik, Fazila [1]
Khan, Qazi Waqas [2]
Rizwan, Atif [2]
Alnashwan, Rana [3]
Atteia, Ghada [3]
Affiliations
[1] Iqra Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[2] Jeju Natl Univ, Dept Comp Engn, Jejusi 63243, South Korea
[3] Princess Nourah bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia
Keywords
feature selection; data resampling; intrusion detection; applied machine learning; deep learning; Internet
DOI
10.3390/math12121799
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Intrusion Detection Systems (IDSs) play a crucial role in safeguarding network infrastructures from cyber threats and ensuring the integrity of highly sensitive data. Conventional IDS technologies, although they achieve high accuracy, frequently suffer from substantial model bias, caused primarily by class imbalance in the data and by features of little relevance. This study tackles these challenges by proposing a machine learning (ML) based IDS that minimizes misclassification errors and corrects model bias, significantly improving the predictive accuracy and generalizability of the IDS. The proposed system employs feature selection techniques, such as Recursive Feature Elimination (RFE), sequential feature selection (SFS), and statistical feature selection, to refine the input feature set and minimize the impact of non-predictive attributes. In addition, this work incorporates data resampling methods such as the Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbor (SMOTE_ENN), Adaptive Synthetic Sampling (ADASYN), and the Synthetic Minority Oversampling Technique combined with Tomek Links (SMOTE_Tomek) to address class imbalance and improve the accuracy of the model. The experimental results indicate that the proposed model, especially when using the random forest (RF) algorithm, surpasses existing models in accuracy, precision, recall, and F-score across different data resampling methods. Using the ADASYN resampling method, the RF model achieves an accuracy of 99.9985% for botnet attacks and 99.9777% for Man-in-the-Middle (MITM) attacks, demonstrating the effectiveness of the approach in dealing with imbalanced data distributions. This research not only improves the ability of IDSs to identify botnet and MITM attacks but also provides a scalable and efficient solution that can be used in other areas where data imbalance is a recurring problem. This work has implications beyond IDS, offering valuable insights into using ML techniques in complex real-world scenarios.
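The oversampling idea shared by the SMOTE-family methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data are hypothetical, and only the basic SMOTE interpolation step is shown (no ENN or Tomek-link cleaning, and no ADASYN density weighting). In practice a library such as imbalanced-learn would typically be used instead.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by SMOTE-style interpolation.

    minority: list of equal-length feature tuples from the minority class.
    Each new point lies on the segment between a randomly chosen minority
    sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data, made up for illustration: 4 minority samples vs 10 majority ones.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
majority_count = 10
new_points = smote_sketch(minority, majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Because every synthetic point is interpolated between two real minority samples, the minority class grows to match the majority count without duplicating records exactly. Resampling of this kind would normally be applied to the training split only, so that evaluation data keeps its original class distribution.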
Pages: 25