Unsupervised outlier detection using random subspace and subsampling ensembles of Dirichlet process mixtures

Times cited: 2
Authors
Kim, Dongwook [1 ]
Park, Juyeon [2 ]
Chung, Hee Cheol [3 ,4 ]
Jeong, Seonghyun [5 ,6 ]
Affiliations
[1] Nexon, Seongnam Si 13487, Gyeonggi Do, South Korea
[2] Danggeun Market, Seoul 06611, South Korea
[3] Univ N Carolina, Dept Math & Stat, Charlotte, NC 28223 USA
[4] Univ N Carolina, Sch Data Sci, Charlotte, NC 28223 USA
[5] Yonsei Univ, Dept Stat & Data Sci, Seoul 03722, South Korea
[6] Yonsei Univ, Dept Appl Stat, Seoul 03722, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Anomaly detection; Gaussian mixture models; Outlier ensembles; Random projection; Variational inference; ANOMALY DETECTION; PROJECTION; MODEL;
DOI
10.1016/j.patcog.2024.110846
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Probabilistic mixture models are recognized as effective tools for unsupervised outlier detection owing to their interpretability and global characteristics. Among these, Dirichlet process mixture models stand out as a strong alternative to conventional finite mixture models for both clustering and outlier detection tasks. Unlike finite mixture models, Dirichlet process mixtures are infinite mixture models that automatically determine the number of mixture components based on the data. Despite their advantages, the adoption of Dirichlet process mixture models for unsupervised outlier detection has been limited by challenges related to computational inefficiency and sensitivity to outliers in the construction of outlier detectors. Additionally, Dirichlet process Gaussian mixtures struggle to effectively model non-Gaussian data with discrete or binary features. To address these challenges, we propose a novel outlier detection method that utilizes ensembles of Dirichlet process Gaussian mixtures. This unsupervised algorithm employs random subspace and subsampling ensembles to ensure efficient computation and improve the robustness of the outlier detector. The ensemble approach also improves the suitability of the proposed method for detecting outliers in non-Gaussian data. Furthermore, our method uses variational inference for Dirichlet process mixtures, which ensures efficient computation. Empirical analyses using benchmark datasets demonstrate that our method outperforms existing approaches in unsupervised outlier detection.
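The random subspace and subsampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base detector here is a single diagonal Gaussian used as a stand-in for a Dirichlet process Gaussian mixture, and the function names, the ensemble size, the subsample size, and the per-dimension averaging of scores are all assumptions made for illustration.

```python
import math
import random


def fit_diag_gaussian(rows, features):
    """Fit an independent Gaussian to each selected feature (stand-in base detector)."""
    params = []
    for j in features:
        values = [row[j] for row in rows]
        mean = sum(values) / len(values)
        # Small floor keeps the variance strictly positive.
        var = sum((v - mean) ** 2 for v in values) / len(values) + 1e-6
        params.append((j, mean, var))
    return params


def neg_log_density(row, params):
    """Negative log-density of one row under the fitted diagonal Gaussian."""
    return sum(
        0.5 * math.log(2 * math.pi * var) + (row[j] - mean) ** 2 / (2 * var)
        for j, mean, var in params
    )


def ensemble_outlier_scores(data, n_members=25, n_sub=64, seed=0):
    """Average per-dimension outlier scores over random subspaces and subsamples."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    scores = [0.0] * n
    for _ in range(n_members):
        k = rng.randint(1, d)                    # assumed rule: random subspace size
        features = rng.sample(range(d), k)       # random subspace
        rows = rng.sample(data, min(n_sub, n))   # random subsample for fitting
        params = fit_diag_gaussian(rows, features)
        for i, row in enumerate(data):
            # Divide by k so members with different subspace sizes are comparable.
            scores[i] += neg_log_density(row, params) / k
    return [s / n_members for s in scores]


# Toy data: 200 standard-normal inliers in 5 dimensions plus one distant point.
gen = random.Random(1)
data = [[gen.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
data.append([8.0] * 5)
scores = ensemble_outlier_scores(data)
print("index of highest score:", scores.index(max(scores)))
```

Each ensemble member sees only a few features and a small subsample, which keeps fitting cheap and limits the influence of any single outlier on a member. To move closer to the method in the abstract, the stand-in detector could be replaced by a Dirichlet process Gaussian mixture fitted with variational inference, for example scikit-learn's `BayesianGaussianMixture` with `weight_concentration_prior_type="dirichlet_process"`.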
Pages: 14
Related papers
29 in total
  • [22] Statistical Anomaly Detection in Human Dynamics Monitoring Using a Hierarchical Dirichlet Process Hidden Markov Model
    Fuse, Takashi
    Kamiya, Keita
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2017, 18 (11) : 3083 - 3092
  • [23] Unsupervised Anomaly Detection Process Using LLE and HDBSCAN by Style-GAN as a Feature Extractor
    Lee, Taeheon
    Kim, Yoonseok
    Hyun, Youngjoo
    Mo, Jeonghoon
    Yoo, Youngjun
    INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING, 2024, 25 (01) : 51 - 63
  • [24] Monte Carlo methods for Bayesian analysis of survival data using mixtures of Dirichlet process priors
    Doss, H
    Huffer, FW
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2003, 12 (02) : 282 - 307
  • [25] An Efficient and Robust Unsupervised Anomaly Detection Method Using Ensemble Random Projection in Surveillance Videos
    Hu, Jingtao
    Zhu, En
    Wang, Siqi
    Liu, Xinwang
    Guo, Xifeng
    Yin, Jianping
    SENSORS, 2019, 19 (19)
  • [26] A Dirichlet Process Mixture Model for Autonomous Sleep Apnea Detection using Oxygen Saturation Data
    Li, Zhenglin
    Arvaneh, Mahnaz
    Elphick, Heather E.
    Kingshott, Ruth N.
    Mihaylova, Lyudmila S.
    PROCEEDINGS OF 2020 23RD INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION 2020), 2020, : 622 - 629
  • [27] A Batch-Incremental Process Fault Detection and Diagnosis Using Mixtures of Probabilistic PCA
    Nakamura, Thiago
    Lemos, Andre
    2014 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2014,
  • [28] Quality-Relevant Batch Process Fault Detection Using a Multiway Multi-Subspace CVA Method
    Cao, Yuping
    Hu, Yongping
    Deng, Xiaogang
    Tian, Xuemin
    IEEE ACCESS, 2017, 5 : 23256 - 23265
  • [29] Machine Learning Approach for Malware Detection Using Random Forest Classifier on Process List Data Structure
    Joshi, Santosh
    Upadhyay, Himanshu
    Lagos, Leonel
    Akkipeddi, Naga Suryamitra
    Guerra, Valerie
    2ND INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND DATA MINING (ICISDM 2018), 2018, : 98 - 102