Variational autoencoder-based outlier detection for high-dimensional data

Cited by: 9
Authors
Li, Yongmou [1, 2]
Wang, Yijie [1, 2]
Ma, Xingkong [2]
Affiliations
[1] Natl Univ Def Technol, Natl Lab Parallel & Distributed Proc, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
Funding
Science Foundation of the Ministry of Education of China; National Natural Science Foundation of China;
Keywords
Variational autoencoders; outlier detection; high-dimensional data;
DOI
10.3233/IDA-184240
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Analysis of high-dimensional data often suffers from the curse of dimensionality and from complicated correlations among dimensions. Dimension reduction methods are often used to alleviate these problems. Existing outlier detection methods based on dimension reduction usually either rely only on reconstruction error to detect outliers or apply conventional outlier detection methods to the reduced data; either approach can degrade detection performance because it considers only part of the information in the data. Few studies have combined these two strategies for outlier detection. In this paper, we propose an outlier detection method based on the Variational Autoencoder (VAE), which combines the low-dimensional representation and the reconstruction error to detect outliers. Specifically, we first model the data using a VAE, then extract four outlier scores from the VAE model, and finally propose an ensemble method to combine the four outlier scores. Experiments conducted on six real-world datasets show that the proposed method performs better than, or at least comparably to, state-of-the-art methods.
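The idea of combining a reconstruction-based score with a score computed on the low-dimensional representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the four outlier scores or the ensemble rule, so the sketch uses only two scores (squared reconstruction error and a standardized distance in the latent space) and averages their ranks. PCA is swapped in for the VAE so that the example needs nothing beyond NumPy. The function names `pca_encode_decode`, `rank_normalize` and `outlier_scores` are hypothetical.

```python
import numpy as np


def pca_encode_decode(X, k):
    """Stand-in for a trained VAE (assumption): PCA projection and reconstruction."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                 # (n_features, k) projection matrix
    Z = Xc @ W                   # low-dimensional representation
    X_hat = Z @ W.T + mean       # reconstruction in the original space
    return Z, X_hat


def rank_normalize(scores):
    """Map scores to [0, 1] by rank so that differently scaled scores are comparable."""
    ranks = np.argsort(np.argsort(scores))
    return ranks / (len(scores) - 1)


def outlier_scores(X, k=2):
    """Return (reconstruction error, latent distance, rank-averaged ensemble score)."""
    Z, X_hat = pca_encode_decode(X, k)
    # Score 1: squared reconstruction error per sample.
    recon_err = np.sum((X - X_hat) ** 2, axis=1)
    # Score 2: squared distance from the latent center, per-axis standardized.
    Z_std = Z / (Z.std(axis=0) + 1e-12)
    latent_dist = np.sum(Z_std ** 2, axis=1)
    # Ensemble (assumed rule): plain average of the two rank-normalized scores.
    combined = 0.5 * (rank_normalize(recon_err) + rank_normalize(latent_dist))
    return recon_err, latent_dist, combined
```

Calling `outlier_scores(X, k=2)` on an `(n_samples, n_features)` array returns the two individual scores and their ensemble; higher values indicate more outlying samples. A sample that lies off the learned subspace is caught by the first score, while a sample that reconstructs well but sits far from the bulk of the latent codes is caught by the second, which is the motivation the abstract gives for using both.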
Pages: 991-1002
Number of pages: 12
Related papers
50 in total
  • [1] Autoencoder-based outlier detection for sparse, high dimensional data
    Chen, Wanghu
    Li, Huijun
    Li, Jing
    Arshad, Ali
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020: 2735-2742
  • [2] Efficient Outlier Detection for High-Dimensional Data
    Liu, Huawen
    Li, Xuelong
    Li, Jiuyong
    Zhang, Shichao
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2018, 48 (12): 2451-2461
  • [3] Research on Outlier Detection for High-Dimensional Data Based on PPCLOF
    Chen, Chen
    Luo, Kaiwen
    Min, Lan
    Li, Shenglin
    JOURNAL OF WEB ENGINEERING, 2021, 20 (03): 743-758
  • [4] Graph autoencoder-based unsupervised outlier detection
    Du, Xusheng
    Yu, Jiong
    Chu, Zheng
    Jin, Lina
    Chen, Jiaying
    INFORMATION SCIENCES, 2022, 608: 532-550
  • [5] A Comparison of Outlier Detection Techniques for High-Dimensional Data
    Xu, Xiaodan
    Liu, Huawen
    Li, Li
    Yao, Minghai
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2018, 11 (01): 652-662
  • [7] Contextual anomaly detection for high-dimensional data using Dirichlet process variational autoencoder
    Kim, Hyojoong
    Kim, Heeyoung
    IISE TRANSACTIONS, 2023, 55 (05): 433-444
  • [8] A Unified Unsupervised Gaussian Mixture Variational Autoencoder for High Dimensional Outlier Detection
    Liao, Weixian
    Guo, Yifan
    Chen, Xuhui
    Li, Pan
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018: 1208-1217
  • [9] OUTLIER DETECTION BASED ON DENSITY OF HYPERCUBE IN HIGH-DIMENSIONAL DATA STREAM
    Shou, Zhaoyu
    Zou, Fengbo
    Li, Simin
    Lu, Xianying
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2019, 15 (03): 873-889
  • [10] A geometric framework for outlier detection in high-dimensional data
    Herrmann, Moritz
    Pfisterer, Florian
    Scheipl, Fabian
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 13 (03)