A multivariate extreme value theory approach to anomaly clustering and visualization

Cited by: 0
Authors
Maël Chiapino
Stephan Clémençon
Vincent Feuillard
Anne Sabourin
Affiliations
[1] LTCI
[2] Télécom Paris
[3] Institut polytechnique de Paris
[4] Airbus Central R&T
[5] AI Research
Source
Computational Statistics | 2020 / Vol. 35
Keywords
Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization;
DOI
Not available
Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X} = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups.
This paper takes this approach further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, which implicitly defines a similarity measure between anomalies. It is explained at length how this similarity measure can be used to cluster extreme observations and obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets and on real observations from the aeronautics domain.
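The abstract does not say how the posterior probabilities are turned into a similarity measure or which graph-mining tools are used. As a rough illustration only, the sketch below assumes a similarity equal to the inner product of two points' posterior vectors and a 2-d layout from the normalized graph Laplacian (a spectral embedding). The function names `posterior_similarity` and `spectral_embedding_2d` and the toy posterior values are hypothetical and are not taken from the paper.

```python
import numpy as np


def posterior_similarity(posteriors):
    """Assumed similarity: inner product of posterior vectors.

    Entry (i, j) is the probability that points i and j share the same
    latent anomaly type when each type is drawn independently from the
    corresponding posterior. This choice is an assumption of the sketch.
    """
    P = np.asarray(posteriors, dtype=float)
    return P @ P.T


def spectral_embedding_2d(similarity):
    """Planar coordinates from the normalized Laplacian of the similarity graph."""
    W = np.array(similarity, dtype=float)
    np.fill_diagonal(W, 0.0)  # drop self-loops
    degree = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    n = W.shape[0]
    laplacian = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; skip the trivial first one.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1:3]


# Toy posteriors (made up): 4 extreme points, 2 anomaly types.
toy_posteriors = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])
S = posterior_similarity(toy_posteriors)
coords = spectral_embedding_2d(S)
```

In this toy case the first embedding coordinate separates points 0 and 1 from points 2 and 3 by sign, which is the kind of grouping a clustering step would then pick up.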
Pages: 607-628
Number of pages: 21
Related papers
50 in total
[41]   Revisiting the hybrid approach of anomaly detection and extreme value theory for estimating pedestrian crashes using traffic conflicts obtained from artificial intelligence-based video analytics [J].
Hussain, Fizza ;
Ali, Yasir ;
Li, Yuefeng ;
Haque, Md Mazharul .
ACCIDENT ANALYSIS AND PREVENTION, 2024, 199
[42]   A Reliable Approach for Lightweight Anomaly Detection in Sensors Using Continuous Wavelet Transform and Vector Clustering [J].
Ahmad, Rami ;
Alhasan, Waseem ;
Wazirali, Raniyah ;
Almajalid, Rania .
IEEE SENSORS JOURNAL, 2024, 24 (15) :24921-24930
[43]   A Clustering Approach for Discovering Intrinsic Clusters in Multivariate Geostatistical Data [J].
Fouedjio, Francky .
MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION (MLDM 2016), 2016, 9729 :491-500
[44]   Practical Approach to Asynchronous Multivariate Time Series Anomaly Detection and Localization [J].
Abdulaal, Ahmed ;
Liu, Zhuanghua ;
Lancewicki, Tomer .
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, :2485-2494
[45]   Automatic alarm setup using extreme value theory [J].
Toshkova, Daniela ;
Asher, Matthew ;
Hutchinson, Paul ;
Lieven, Nicholas .
MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2020, 139
[46]   Visualization Assisted Approach to Anomaly and Attack Detection in Water Treatment Systems [J].
Meleshko, Alexey ;
Shulepov, Anton ;
Desnitsky, Vasily ;
Novikova, Evgenia ;
Kotenko, Igor .
WATER, 2022, 14 (15)
[47]   Artificial bee colony algorithm for clustering: an extreme learning approach [J].
Alshamiri, Abobakr Khalil ;
Singh, Alok ;
Surampudi, Bapi .
SOFT COMPUTING, 2016, 20 (08) :3163-3176
[48]   Artificial bee colony algorithm for clustering: an extreme learning approach [J].
Abobakr Khalil Alshamiri ;
Alok Singh ;
Bapi Raju Surampudi .
Soft Computing, 2016, 20 :3163-3176
[49]   Anomaly Detection in Industrial Multivariate Time-Series Data With Neutrosophic Theory [J].
Liu, Peng ;
Han, Qilong ;
Wu, Ting ;
Tao, Wenjian .
IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (15) :13458-13473
[50]   A visual-numeric approach to clustering and anomaly detection for trajectory data [J].
Kumar, Dheeraj ;
Bezdek, James C. ;
Rajasegarar, Sutharshan ;
Leckie, Christopher ;
Palaniswami, Marimuthu .
VISUAL COMPUTER, 2017, 33 (03) :265-281