Fast Activation Function Approach for Deep Learning Based Online Anomaly Intrusion Detection

Cited by: 21
Authors
Alrawashdeh, Khaled [1 ]
Purdy, Carla [1 ]
Institution
[1] Univ Cincinnati, Dept Elect Engn & Comp Syst, Cincinnati, OH 45221 USA
Source
2018 IEEE 4TH INTERNATIONAL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY), 4TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) AND 3RD IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS) | 2018
Keywords
Deep Learning; Neural Network; Intrusion Detection; Anomaly Detection; ReLU; DBN; RBM; AutoEncoder; ELM; Cyber Security;
DOI
10.1109/BDS/HPSC/IDS18.2018.00016
Chinese Library Classification
TP [Automation technology; Computer technology]
Discipline classification code
0812
Abstract
Piecewise-linear activation functions such as ReLU have become the catalyst that revolutionized the training of deep neural networks. Common nonlinear activation functions such as tanh and sigmoid saturate during training, and this saturation causes the vanishing-gradient problem in stochastic gradient descent. We propose a fast activation function, the Adaptive Linear Function (ALF), to increase the convergence speed and accuracy of deep learning structures for real-time applications. The ALF reduces both the saturation effects caused by soft activation functions and the vanishing gradient caused by the negative values of the ReLU. We evaluate the training method in an online anomaly intrusion detection system using a Deep Belief Network (DBN) on four benchmark datasets. The activation function increases the convergence speed of the DBN, reducing the total training time by 80% compared to the sigmoid, ReLU, and tanh activation functions. The method achieves an accuracy of 98.59% on the full 10% KDDCUP'99 test dataset, 96.2% on the NSL-KDD dataset, 98.4% on the Kyoto dataset, and 96.57% on the CSIC HTTP dataset. The proposed activation function outperforms the results obtained with any of the three activation functions (sigmoid, ReLU, or tanh) on the test stream of all four datasets. Furthermore, the DBN structure outperforms state-of-the-art networks such as the Stacked Sparse AutoEncoder Based Extreme Learning Machine (SSAELM) in both accuracy and convergence speed.
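The abstract's exact definition of the ALF is not reproduced in this record, but its stated goals (avoid sigmoid/tanh saturation and avoid the dead-gradient region of ReLU for negative inputs) can be illustrated with a generic piecewise-linear activation with a small non-zero negative-side slope. The function `alf` and the slope parameter `alpha` below are illustrative assumptions, not the paper's published formulation:

```python
import numpy as np

# Standard activations named in the abstract, for comparison.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical sketch of an adaptive piecewise-linear activation:
# identity for positive inputs, a small slope `alpha` for negative
# inputs, so the gradient never saturates toward zero (unlike sigmoid
# and tanh) and never vanishes on the negative side (unlike ReLU).
# NOTE: `alpha` and this exact form are assumptions; the paper's ALF
# definition may differ.
def alf(x, alpha=0.1):
    return np.where(x >= 0.0, x, alpha * x)

def alf_grad(x, alpha=0.1):
    return np.where(x >= 0.0, 1.0, alpha)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
# The sigmoid gradient shrinks toward zero for large |x| (saturation),
# while the piecewise-linear gradient stays at alpha or 1.
sig_grad = sigmoid(x) * (1.0 - sigmoid(x))
print("sigmoid grad:", sig_grad.round(4))
print("alf grad:    ", alf_grad(x))
```

The comparison makes the abstract's saturation argument concrete: at |x| = 5 the sigmoid derivative is already below 0.01, whereas the piecewise-linear derivative remains bounded away from zero everywhere.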
Pages: 5-13
Page count: 9