Research on unsupervised anomaly data detection method based on improved automatic encoder and Gaussian mixture model

Cited by: 0

Authors:

Liu, Xiangyu [1]

Zhu, Shibing [1]

Yang, Fan [1]

Liang, Shengjun [2]

Affiliations:

[1] Univ Space Engn, Beijing 101416, Peoples R China

[2] Beijing Informat Sci & Technol Univ, Beijing 100026, Peoples R China

Source:

JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS | 2022 / Vol. 11 / Issue 01

Keywords:

Cloud security; Unsupervised machine learning; Anomalous data detection; Memory module; Deep autoencoder; Gaussian mixture model; Support vector machine;

DOI:

10.1186/s13677-022-00328-z

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

With the development of cloud computing, more and more security problems such as the "fuzzy boundary" are being exposed. To address them, unsupervised anomaly detection is increasingly used in cloud security, and density estimation is a common choice for the clustering tasks involved. In practice, however, the sheer volume of data and the high dimensionality of its features make data calibration difficult, introduce redundancy, and reduce the effectiveness of density estimation algorithms. Although auto-encoders have made fruitful progress in dimensionality reduction, an auto-encoder used alone may still generalize too well and fail to detect specific anomalies. This paper proposes a new unsupervised anomaly detection method, MemAe-gmm-ma. MemAe-gmm-ma uses a deep auto-encoder to generate a low-dimensional representation and a reconstruction error for each input sample. It adds a memory module inside the auto-encoder to better learn the inner meaning of the training samples, and finally feeds the low-dimensional information of the samples into a Gaussian mixture model (GMM) for density estimation. MemAe-gmm-ma performs better on public benchmark datasets, with a 4.47% improvement in standard F1 score over the MemAe model on the NSL-KDD dataset and a 9.77% improvement in standard F1 score over the CAE-GMM model on the CIC-IDS-2017 dataset.
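The final stage described in the abstract, density estimation with a GMM over low-dimensional features, can be illustrated with a minimal sketch. It is not the authors' implementation: the auto-encoder and memory module are omitted, the (latent code, reconstruction error) pairs are synthetic stand-ins, and the diagonal covariance, the function names, and all parameter values are assumptions made only for illustration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_logs(x, weights, means, variances):
    """log(weight_k) + log N(x | mean_k, var_k) for every component k."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def sample_energy(x, weights, means, variances):
    """Negative log-likelihood under the mixture (higher = more anomalous)."""
    logs = component_logs(x, weights, means, variances)
    top = max(logs)  # log-sum-exp trick for numerical stability
    return -(top + math.log(sum(math.exp(l - top) for l in logs)))


def fit_gmm_diag(data, k, iters=50, seed=0, eps=1e-6):
    """Fit a k-component diagonal-covariance GMM with plain EM."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    means = [list(p) for p in rng.sample(data, k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            logs = component_logs(x, weights, means, variances)
            top = max(logs)
            exps = [math.exp(l - top) for l in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + eps
            weights[j] = nj / n
            means[j] = [
                sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                for d in range(dim)
            ]
            variances[j] = [
                sum(r[j] * (x[d] - means[j][d]) ** 2
                    for r, x in zip(resp, data)) / nj + eps
                for d in range(dim)
            ]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return weights, means, variances


# Synthetic stand-ins for (latent code, reconstruction error) pairs that a
# trained auto-encoder might produce for normal samples (hypothetical values).
rng = random.Random(1)
normal_features = [
    (rng.gauss(center, 0.3), abs(rng.gauss(0.1, 0.02)))
    for center in (0.0, 3.0)
    for _ in range(100)
]
weights, means, variances = fit_gmm_diag(normal_features, k=2)

# A point resembling the training data versus one with a large
# reconstruction error, which should receive a much higher energy.
normal_energy = sample_energy((0.0, 0.1), weights, means, variances)
anomaly_energy = sample_energy((1.5, 1.5), weights, means, variances)
print(normal_energy, anomaly_energy)
```

In a sketch like this, a sample would be flagged as anomalous when its energy exceeds a threshold, for example a high percentile of the energies observed on the training data. That thresholding rule is likewise an assumption and is not stated in the abstract.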

Pages: 16
