A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

Cited by: 149
Authors
Agarap, Abien Fred M. [1]
Affiliations
[1] Adamson Univ, Dept Comp Sci, 900 San Marcelino St, Manila 1000, Philippines
Source
PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018) | 2018
Keywords
artificial intelligence; artificial neural networks; gated recurrent units; intrusion detection; machine learning; recurrent neural networks; support vector machine;
DOI
10.1145/3195106.3195117
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The Gated Recurrent Unit (GRU) is a recently developed variation of the long short-term memory (LSTM) unit, both of which are variants of the recurrent neural network (RNN). Through empirical evidence, both models have been proven effective in a wide variety of machine learning tasks such as natural language processing, speech recognition, and text classification. Conventionally, like most neural networks, both of the aforementioned RNN variants employ the Softmax function as the final output layer for prediction and the cross-entropy function for computing the loss. In this paper, we present an amendment to this norm by introducing a linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function is replaced with a margin-based function. While there have been similar studies, this proposal is primarily intended for binary classification on intrusion detection using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model performs better than the conventional GRU-Softmax model. The proposed model reached a training accuracy of approximately 81.54% and a testing accuracy of approximately 84.15%, while the latter reached a training accuracy of approximately 63.07% and a testing accuracy of approximately 70.75%. In addition, the juxtaposition of these two final output layers indicates that the SVM would outperform Softmax in prediction time, a theoretical implication that was supported by the actual training and testing times observed in the study.
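To make the architectural change described in the abstract concrete, the following is a minimal sketch, in PyTorch rather than the authors' TensorFlow implementation, of a GRU whose Softmax output layer is replaced by a linear SVM head trained with a squared-hinge (margin-based) loss. The layer sizes, the C penalty value, and the toy inputs are illustrative assumptions, not values taken from the paper.

    # Minimal sketch (not the authors' code): GRU classifier with a linear SVM
    # output layer and an L2-regularized squared-hinge loss instead of
    # Softmax + cross-entropy. Sizes and hyperparameters are illustrative.
    import torch
    import torch.nn as nn

    class GRUSVM(nn.Module):
        def __init__(self, input_size=10, hidden_size=256):
            super().__init__()
            self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
            # Linear layer producing one raw decision score; no Softmax on top.
            self.svm = nn.Linear(hidden_size, 1)

        def forward(self, x):
            _, h = self.gru(x)                   # h: (1, batch, hidden_size)
            return self.svm(h[-1]).squeeze(-1)   # raw SVM scores, shape (batch,)

    def svm_loss(scores, labels, weight, C=0.5):
        # labels are +1 / -1; squared hinge plus an L2 penalty on the SVM weights.
        hinge = torch.clamp(1.0 - labels * scores, min=0.0) ** 2
        return C * torch.mean(hinge) + 0.5 * torch.sum(weight ** 2)

    model = GRUSVM()
    x = torch.randn(32, 21, 10)                      # 32 sequences, 21 steps, 10 features
    y = (torch.rand(32) > 0.5).float() * 2 - 1       # random +1 / -1 labels
    scores = model(x)
    loss = svm_loss(scores, y, model.svm.weight)
    loss.backward()
    # Prediction is simply sign(scores) -> {+1: intrusion, -1: normal},
    # which is where the claimed prediction-time advantage over Softmax comes from.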
Pages: 26-30
Page count: 5