A multivariate extreme value theory approach to anomaly clustering and visualization

Cited by: 0
Authors
Maël Chiapino
Stephan Clémençon
Vincent Feuillard
Anne Sabourin
Affiliations
[1] LTCI
[2] Télécom Paris
[3] Institut polytechnique de Paris
[4] Airbus Central R&T
[5] AI Research
Source
Computational Statistics | 2020 / Vol. 35
Keywords
Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization;
DOI
Not available
Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X} = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups.
This paper exploits this approach much further by means of a novel mixture model that makes it possible to describe the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, implicitly defining a similarity measure between anomalies. It is explained at length how the latter makes it possible to cluster extreme observations and obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the clustering and 2-d visual display thus designed are illustrated on simulated datasets and on real observations as well, in the aeronautics application domain.
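As a rough illustration of the idea summarized in the abstract, the sketch below turns posterior probabilities over anomaly types into a pairwise similarity and then groups points with a basic graph operation. Everything in it is an assumption made for illustration and is not taken from the paper: the similarity is taken to be the probability that two points share the same latent type when their types are drawn independently from their posteriors, the clustering step is a simple stand-in (connected components of a thresholded similarity graph) for whatever graph-mining tools the authors actually use, and the names `posterior_similarity`, `threshold_clusters`, `post` and `tau` are hypothetical.

```python
from collections import deque


def posterior_similarity(post):
    """Pairwise similarity from posterior probabilities (illustrative assumption).

    post[i][k] is the assumed posterior probability that extreme point i
    has anomaly type k. W[i][j] = sum_k post[i][k] * post[j][k].
    """
    n = len(post)
    return [
        [sum(p * q for p, q in zip(post[i], post[j])) for j in range(n)]
        for i in range(n)
    ]


def threshold_clusters(W, tau):
    """Label connected components of the graph with an edge i-j when W[i][j] >= tau.

    This is a simple stand-in for a graph-mining clustering step.
    """
    n = len(W)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:  # breadth-first search over sufficiently similar points
            i = queue.popleft()
            for j in range(n):
                if labels[j] == -1 and W[i][j] >= tau:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Toy posteriors for four extreme points and two hypothetical anomaly types.
post = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
W = posterior_similarity(post)
labels = threshold_clusters(W, tau=0.5)
print(labels)  # [0, 0, 1, 1]
```

With these toy values, points 0 and 1 (mostly the first type) end up in one cluster and points 2 and 3 in another. The threshold `tau` is arbitrary here and would need tuning on real data.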
Pages: 607-628
Page count: 21
Related papers
50 in total
[21]   Anomaly Detection Method of Aircraft System using Multivariate Time Series Clustering and Classification Techniques [J].
Ben Slimene, Mohamed ;
Ouali, Mohamed-Salah .
IFAC PAPERSONLINE, 2022, 55 (10) :1582-1587
[22]   Clustering-based anomaly detection in multivariate time series data [J].
Li, Jinbo ;
Izakian, Hesam ;
Pedrycz, Witold ;
Jamal, Iqbal .
APPLIED SOFT COMPUTING, 2021, 100
[23]   PROBABILISTIC PATIENT MONITORING USING EXTREME VALUE THEORY: A Multivariate, Multimodal Methodology for Detecting Patient Deterioration [J].
Hugueny, Samuel ;
Clifton, David A. ;
Tarassenko, Lionel .
BIOSIGNALS 2010: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON BIO-INSPIRED SYSTEMS AND SIGNAL PROCESSING, 2010, :5-12
[24]   A Multivariate Clustering Approach for Infrastructure Failure Predictions [J].
Luo, Simon ;
Chu, Victor W. ;
Zhou, Jianlong ;
Chen, Fang ;
Wong, Raymond K. ;
Huang, Weidong .
2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017), 2017, :274-281
[25]   An extreme value prediction method based on clustering algorithm [J].
Dai, Baorui ;
Xia, Ye ;
Li, Qi .
RELIABILITY ENGINEERING & SYSTEM SAFETY, 2022, 222
[26]   Unsupervised Anomaly Detection Approach for Multivariate Time Series [J].
Zhou, Yuanlin ;
Song, Yingxuan ;
Qian, Mideng .
2021 21ST INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C 2021), 2021, :229-235
[27]   Visualization and exploration of spatial probability density functions: A clustering based approach [J].
Bordoloi, UD ;
Kao, DL ;
Shen, HW .
VISUALIZATION AND DATA ANALYSIS 2004, 2004, 5295 :57-64
[29]   A novel anomaly detection approach based on clustering and decision-level fusion [J].
Zhong, Shengwei ;
Zhang, Ye .
IMAGING SPECTROMETRY XX, 2015, 9611
[30]   Automated Anomaly Detection in CPS Log Files: A Time Series Clustering Approach [J].
Schmidt, Tabea ;
Hauer, Florian ;
Pretschner, Alexander .
COMPUTER SAFETY, RELIABILITY, AND SECURITY, SAFECOMP 2020, 2020, 12234 :179-194