Performance-related Internal Clustering Validation Index for Clustering-based Anomaly Detection

Times cited: 3

Authors:

Lee, HyunYong [1]

Kim, Nac-Woo [1]

Lee, Jun-Gi [1]

Lee, Byung-Tak [1]

Affiliations:

[1] Electronics and Telecommunications Research Institute (ETRI), Honam Research Center (HRC), Gwangju, South Korea

Source:

12TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC 2021): BEYOND THE PANDEMIC ERA WITH ICT CONVERGENCE INNOVATION | 2021

Keywords:

Anomaly detection; clustering; validation index; performance; deep learning

DOI:

10.1109/ICTC52510.2021.9620760

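Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809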

Abstract:

One possible way to improve unsupervised anomaly detection is to use per-cluster models, particularly when the given data includes various cluster-level features. In realizing clustering-based anomaly detection, one natural question is how to determine the number of clusters that will likely lead to the optimal performance. In this paper, we propose a method that reflects the performance of anomaly detection in determining the number of clusters. We first derive an internal clustering validation index using the normality scores of trained per-cluster models for unlabeled training data for cases with different numbers of clusters. Then, we determine the number of clusters by selecting the case whose clustering validation index is the highest, which means that per-cluster models extract useful features for anomaly detection. Through experiments, we show that our proposed clustering validation index is highly correlated with anomaly detection accuracy (i.e., the average Pearson correlation coefficient is 0.965).
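The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the index formula, so `validation_index` below uses the mean normality score as a placeholder assumption, and the function names and the sample scores are hypothetical. The `pearson` helper shows how the correlation between an index and detection accuracy could be computed.

```python
from math import sqrt


def validation_index(normality_scores):
    """Placeholder index (assumption): mean normality score of the
    per-cluster models on the unlabeled training data."""
    return sum(normality_scores) / len(normality_scores)


def select_num_clusters(scores_by_k):
    """Compute the index for each candidate number of clusters and
    return the candidate with the highest index, plus all indices."""
    indices = {k: validation_index(s) for k, s in scores_by_k.items()}
    best_k = max(indices, key=indices.get)
    return best_k, indices


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Made-up normality scores for three candidate cluster counts.
scores_by_k = {
    2: [0.60, 0.70, 0.65],
    3: [0.80, 0.90, 0.85],
    4: [0.70, 0.75, 0.80],
}
best_k, indices = select_num_clusters(scores_by_k)
```

In practice the scores for each candidate would come from per-cluster models trained after clustering the data into that many clusters; that training step is outside the scope of this sketch.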

Pages: 1036-1041

Page count: 6

