A multivariate extreme value theory approach to anomaly clustering and visualization

Cited by: 0
Authors
Maël Chiapino
Stephan Clémençon
Vincent Feuillard
Anne Sabourin
Affiliations
[1] LTCI
[2] Télécom Paris
[3] Institut polytechnique de Paris
[4] Airbus Central R&T
[5] AI Research
Source
Computational Statistics | 2020 / Vol. 35
Keywords
Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization;
DOI
Not available
Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X}=(X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups.
This paper takes this approach much further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, implicitly defining a similarity measure between anomalies. It is explained at length how the latter makes it possible to cluster extreme observations and obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the clustering and 2-d visual display thus designed are illustrated on simulated datasets and on real observations from the aeronautics application domain.
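The step from posterior probabilities to a similarity measure and then to clusters can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the construction used in the paper: the abstract does not give the exact similarity, so the sketch takes the inner product of two posterior vectors (the probability that two points share the same latent type if the types are drawn independently from their posteriors). The function names `posterior_similarity` and `threshold_clusters`, the threshold `tau`, and the toy posterior values are all hypothetical.

```python
def posterior_similarity(post):
    """Pairwise similarity from posterior vectors.

    post[i][k] is an assumed posterior probability that extreme point i
    has anomaly type k. The similarity used here (an illustrative choice,
    not taken from the paper) is the inner product of two posterior vectors.
    """
    return [[sum(p * q for p, q in zip(row_i, row_j)) for row_j in post]
            for row_i in post]


def threshold_clusters(sim, tau=0.5):
    """Label the connected components of the graph that links
    points i and j whenever sim[i][j] >= tau (hypothetical threshold)."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= tau:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy posteriors for 4 extreme points over 3 anomaly types (rows sum to 1).
post = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
sim = posterior_similarity(post)
labels = threshold_clusters(sim, tau=0.5)
print(labels)
```

In this toy example the first two points concentrate their posterior mass on the same type and end up in one component, and the last two in another. A weighted graph built from the same similarity matrix could equally be passed to a spectral clustering or graph-layout routine to obtain a planar display.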
Pages: 607-628
Number of pages: 21