Imbalanced tabular data modelization using CTGAN and machine learning to improve IoT Botnet attacks detection

Cited by: 81
Authors
Habibi, Omar [1]
Chemmakha, Mohammed [1]
Lazaar, Mohamed [1]
Affiliations
[1] Mohammed V Univ Rabat, ENSIAS, Rabat, Morocco
Keywords
IoT security; Botnet detection; Machine learning; MLP; Tabular IoT data; CTGAN; DDOS attack; Reconnaissance attack; Adversarial attack; Features privacy; ADVERSARIAL; SELECTION; FEATURES; NETWORKS
DOI
10.1016/j.engappai.2022.105669
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The Internet of Things (IoT) has gained considerable traction in the past few years and has raised many security problems. IoT botnets are among the most serious attacks threatening the reliability of IoT systems, owing to the resource constraints of IoT devices. Most classical and even intelligent solutions for botnet detection, such as machine learning (ML) and deep learning (DL), are trained on unlabeled datasets; in other cases, a dataset is used although its trustworthiness has not been established, which may degrade performance when those models are deployed in security tools to deal with zero-day threats. Two further critical problems are the limits of classical oversampling methods in generating samples and the limits of existing GAN models in understanding complex datasets and modeling real tabular data. The aim of this paper is to implement CTGAN, the state-of-the-art Generative Adversarial Network model for tabular data modeling and generation, in order to overcome the limits mentioned above. The results are promising: after data augmentation using CTGAN, the MLP achieves an accuracy of 98.93%, an F1-score of 0.9907, a geometric score of 0.9874, and sensitivity and specificity of 0.9893 and 0.9856, respectively.
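The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard binary confusion-matrix definitions and assumes that the "geometric score" is the geometric mean of sensitivity and specificity (the usual G-mean); the abstract does not state the exact definition or averaging scheme the authors used. The function name `binary_metrics` and the counts below are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common detection metrics from confusion-matrix counts.

    Assumption: G-mean is sqrt(sensitivity * specificity); the paper's
    exact definition is not given in the abstract.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = binary_metrics(tp=90, fn=10, fp=5, tn=95)

# Under the G-mean assumption, the sensitivity and specificity reported in
# the abstract can be combined the same way.
reported_g_mean = math.sqrt(0.9893 * 0.9856)
```

Under that assumption, combining the reported sensitivity (0.9893) and specificity (0.9856) gives a value that rounds to the reported geometric score of 0.9874, which suggests the G-mean reading is at least plausible.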
Pages: 23