IoT Intrusion Detection Using Machine Learning with a Novel High Performing Feature Selection Method

Cited by: 77
Authors
Albulayhi, Khalid [1]
Abu Al-Haija, Qasem [2]
Alsuhibany, Suliman A. [3]
Jillepalli, Ananth A. [4]
Ashrafuzzaman, Mohammad [5]
Sheldon, Frederick T. [1]
Affiliations
[1] Univ Idaho, Comp Sci Dept, Moscow, ID 83844 USA
[2] Princess Sumaya Univ Technol PSUT, Dept Comp Sci Cybersecur, Amman 11941, Jordan
[3] Qassim Univ, Coll Comp, Dept Comp Sci, Buraydah 51452, Saudi Arabia
[4] Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164 USA
[5] Ashland Univ, Dept Math & Comp Sci, Ashland, OH 44805 USA
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 10
Keywords
cybersecurity; anomaly detection accuracy; feature selection; Internet of Things (IoT); intrusion detection system; machine learning; DETECTION SYSTEM; MUTUAL INFORMATION; INTERNET; MODEL
DOI
10.3390/app12105015
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The Internet of Things (IoT) ecosystem has experienced significant growth in data traffic and, consequently, high dimensionality. Intrusion Detection Systems (IDSs) are essential self-protective tools against various cyber-attacks. However, IoT IDSs face significant challenges due to functional and physical diversity. These IoT characteristics make exploiting all features and attributes for IDS self-protection difficult and unrealistic. This paper proposes and implements a novel feature selection and extraction approach for anomaly-based IDS. The approach begins by using two entropy-based measures, information gain (IG) and gain ratio (GR), to select and extract relevant features in various ratios. Then, mathematical set theory (union and intersection) is used to extract the best features. The model framework is trained and tested on the IoT intrusion dataset 2020 (IoTID20) and the NSL-KDD dataset using four machine learning algorithms: Bagging, Multilayer Perceptron, J48, and IBk. Our approach resulted in 11 and 28 relevant features (out of 86) using the intersection and union, respectively, on IoTID20, and in 15 and 25 relevant features (out of 41) using the intersection and union, respectively, on NSL-KDD. We have further compared our approach with other state-of-the-art studies. The comparison reveals that our model is superior and competent, scoring a very high 99.98% classification accuracy.
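The selection step described in the abstract (rank features by IG and by GR, keep a top fraction of each ranking, then combine the two subsets by intersection and union) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes features are already discretized, and the function names, the `ratio` parameter, and the toy columns (`flag`, `session_id`, `proto`, `noise`) are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(feature, labels):
    """IG = H(labels) - H(labels | feature) for a discrete feature."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(feature, labels):
    """GR = IG / H(feature); taken as 0 when the feature is constant."""
    split_info = entropy(feature)
    return info_gain(feature, labels) / split_info if split_info > 0 else 0.0

def top_fraction(scores, ratio):
    """Set of the highest-scoring feature names, keeping the given fraction."""
    k = max(1, round(len(scores) * ratio))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def select_features(columns, labels, ratio=0.5):
    """Return (intersection, union) of the IG-selected and GR-selected subsets."""
    ig = {name: info_gain(col, labels) for name, col in columns.items()}
    gr = {name: gain_ratio(col, labels) for name, col in columns.items()}
    ig_top, gr_top = top_fraction(ig, ratio), top_fraction(gr, ratio)
    return ig_top & gr_top, ig_top | gr_top

# Toy data (hypothetical): 8 records, label 1 = attack, 0 = normal.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "flag":       [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly predictive, binary
    "session_id": [0, 1, 2, 3, 4, 5, 6, 7],  # unique per record: high IG, low GR
    "proto":      [0, 0, 0, 1, 1, 1, 1, 1],  # partially predictive
    "noise":      [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
intersection, union = select_features(columns, labels, ratio=0.5)
print(sorted(intersection), sorted(union))
```

In this toy case the two rankings disagree: IG favors the identifier-like `session_id` because it splits the data into pure single-record groups, while GR penalizes it for its high split entropy and prefers `proto`. The intersection therefore keeps only the feature both measures agree on, and the union keeps every feature either measure selects, which mirrors why the abstract reports a smaller intersection set and a larger union set.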
Pages: 30