Model Uncertainty for Annotation Error Correction in Deep Learning Based Intrusion Detection System

Cited by: 2
Authors
Chen, Wencheng [1 ]
Li, Hongyu [1 ]
Zeng, Yi [1 ,2 ]
Ren, Zichang [1 ]
Zheng, Xingxin [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Univ Calif San Diego, San Diego, CA 92103 USA
Source
4TH IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2019) / 3RD INTERNATIONAL SYMPOSIUM ON REINFORCEMENT LEARNING (ISRL 2019) | 2019
Keywords
Cyber Security; Intrusion Detection; Deep Learning; Model Uncertainty; Annotation Errors; Neural Networks
DOI
10.1109/SmartCloud.2019.00033
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Accurate network traffic classification is urgently needed in the big data era, as anomalous network traffic has become formidable to classify in today's complicated network environments. Deep Learning (DL) techniques excel at detecting anomalous data because of their capacity to fit the training data. However, this capacity relies on the correctness of the training data, which also makes such models sensitive to annotation errors. We propose that by measuring the uncertainty of the model, annotation errors can be accurately corrected for classifying network traffic. We use dropout to approximate the posterior distribution and compute the Mutual Information (MI) and Softmax Variance (SV) of the output. In this paper, we present a framework named Uncertainty Based Annotation Error Correction (UAEC) that combines MI and SV, and whose accuracy outperforms the other evaluated methods. By modifying the labels of a public dataset, a real-life annotation scenario is simulated. On the regenerated dataset, we compare the detection effectiveness of Euclidean Distance, MI, SV, and UAEC. As demonstrated in the experiments, UAEC attains an average 47.92% increase in detection accuracy.
Pages: 137-142 (6 pages)