Packet Preprocessing in CNN-Based Network Intrusion Detection System

Cited by: 35

Authors:

Jo, Wooyeon [1]

Kim, Sungjin [1]

Lee, Changhoon [2]

Shon, Taeshik [1]

Affiliations:

[1] Ajou Univ, Dept Comp Engn, Suwon 16499, South Korea

[2] SNUT, Dept Comp Engn, Seoul 80523, South Korea

Source:

ELECTRONICS | 2020 / Vol. 9 / Issue 07

Funding:

National Research Foundation, Singapore;

Keywords:

IoT; deep learning; packet preprocessing; intrusion detection system; industrial control system; vehicle; artificial neural networks; data preprocessing; CLASSIFICATION; SECURITY;

DOI:

10.3390/electronics9071151

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The proliferation of various connected platforms, including the Internet of Things (IoT), industrial control systems (ICSs), connected cars, and in-vehicle networks, has resulted in the simultaneous use of multiple protocols and devices. The chaotic situations that result, such as heterogeneous networks in which different protocols and various types of devices are implemented differently by vendors, make it difficult to adopt a flexible security solution such as those in recent deep learning-based intrusion detection system (IDS) studies. These studies optimized the deep learning model for their environment to improve performance, but the basic principle of the deep learning model used was not changed, so the result can be called a next-generation IDS with a model that has few or no requirements. Some studies proposed IDSs based on unsupervised learning, which does not require labeled data. However, not using available assets, such as network packet data, is a waste of resources. If the security solution considers the role and importance of the devices constituting the network and the security area of the protocol standard as defined by experts, the assets can be well used, but the solution will no longer be flexible. Most deep learning model-based IDS studies used a recurrent neural network (RNN), which is a supervised learning model, because the characteristics of the RNN model, especially when long short-term memory (LSTM) is incorporated, are better suited to reflecting the flow of the packet data stream over time, and thus it performs better than other supervised learning models such as the convolutional neural network (CNN). However, if proper preprocessing makes the input data induce the CNN's kernel to sufficiently reflect the network characteristics, the CNN could perform better than other deep learning models in a network IDS.
Hence, we propose the first preprocessing method, called "direct", for network IDS that can use the characteristics of the kernel by using the minimum protocol information: field size and offset. In addition to direct, we propose two more preprocessing techniques called "weighted" and "compressed". Each of these requires additional network information; therefore, only direct conversion was compared with related studies. All the proposed preprocessing methods, including direct, are based on a field-to-pixel philosophy, which can reflect the advantages of the CNN by extracting the convolutional features of each pixel. Direct is the most intuitive method of applying field-to-pixel conversion to reflect an image's convolutional characteristics in the CNN. Weighted and compressed are conversion methods used to evaluate the direct method. Consequently, the IDS constructed using a CNN with the proposed direct preprocessing method demonstrated meaningful performance on the NSL-KDD dataset.
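
The abstract names field size and offset as the only protocol information the "direct" method needs, but it does not give the conversion itself. The following is a minimal sketch of one possible field-to-pixel mapping, not the authors' implementation. The field table IPV4_FIELDS, the function names field_to_pixel and packet_to_image, the linear scaling to 0-255, and the zero-padded square layout are all assumptions made for illustration; only the byte offsets of the IPv4 header fields are standard.

```python
import math

# Hypothetical field table: (name, byte offset, size in bytes).
# Offsets follow the fixed part of an IPv4 header; the selection of
# fields is illustrative and not taken from the paper.
IPV4_FIELDS = [
    ("total_length", 2, 2),
    ("ttl", 8, 1),
    ("protocol", 9, 1),
    ("src_addr", 12, 4),
    ("dst_addr", 16, 4),
]


def field_to_pixel(raw: bytes, offset: int, size: int) -> int:
    """Scale one big-endian field to a single 0-255 pixel intensity.

    Assumption: a linear scaling by the field's maximum value, so that
    fields of different sizes share one intensity range.
    """
    value = int.from_bytes(raw[offset:offset + size], "big")
    max_value = (1 << (8 * size)) - 1
    return value * 255 // max_value


def packet_to_image(raw: bytes, fields=IPV4_FIELDS):
    """Map each field to one pixel and lay the pixels out row by row
    in the smallest square grid that holds them, padding with zeros."""
    pixels = [field_to_pixel(raw, off, size) for _, off, size in fields]
    if not pixels:
        return []
    side = math.isqrt(len(pixels) - 1) + 1  # smallest side with side*side >= len
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Usage: a 20-byte IPv4 header (TTL 64, protocol 1) becomes a 3x3 grid
# that a 2-D convolution kernel could slide over.
header = bytes.fromhex("450000540000400040010000c0a80001c0a800c7")
image = packet_to_image(header)
```

Because each field occupies exactly one pixel, neighbouring pixels correspond to neighbouring protocol fields, which is the property a convolution kernel would exploit under this reading of the abstract.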

Pages: 1 / 15

Page count: 15
