A multivariate extreme value theory approach to anomaly clustering and visualization

Authors
Maël Chiapino
Stephan Clémençon
Vincent Feuillard
Anne Sabourin
Affiliations
[1] LTCI
[2] Télécom Paris
[3] Institut polytechnique de Paris
[4] Airbus Central R&T
[5] AI Research
Source
Computational Statistics | 2020 / Vol. 35
Keywords
Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization
Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X} = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups.
This paper exploits this approach much further by means of a novel mixture model that makes it possible to describe the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, implicitly defining a similarity measure between anomalies. It is explained at length how the latter makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the clustering and 2-d visual display thus designed are illustrated on simulated datasets and on real observations from the aeronautics application domain.
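The step from posterior probabilities to a similarity measure and then to clusters can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not give the similarity formula or the graph-mining tool, so the sketch assumes the similarity between two extreme points is the inner product of their posterior vectors, and it swaps in a plain threshold graph with connected components for the clustering. The function names `posterior_similarity` and `threshold_clusters`, the threshold value, and the posterior matrix `P` are all hypothetical.

```python
import numpy as np

def posterior_similarity(posteriors):
    """Assumed similarity: S[i, j] = sum_k P[i, k] * P[j, k], i.e. the
    probability that points i and j share the same latent anomaly type
    if their types are drawn independently from their posteriors."""
    P = np.asarray(posteriors, dtype=float)
    if P.ndim != 2 or not np.allclose(P.sum(axis=1), 1.0):
        raise ValueError("each row must be a probability vector")
    return P @ P.T

def threshold_clusters(similarity, threshold=0.5):
    """Label the connected components of the graph that links i and j
    whenever similarity[i, j] >= threshold (a simple stand-in for a
    graph-based clustering step)."""
    n = similarity.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(similarity[i] >= threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical posteriors for 4 extreme points over 3 anomaly types.
P = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
])
S = posterior_similarity(P)
labels = threshold_clusters(S, threshold=0.5)
print(labels)
```

With these made-up posteriors, the first two points and the last two points each concentrate on one anomaly type, so they fall into two separate groups. The same similarity matrix could instead be passed to any standard graph layout or community-detection routine to obtain a planar display.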
Pages: 607-628
Page count: 21