Unsupervised anomaly detection ensembles using item response theory

Cited by: 14
Authors
Kandanaarachchi, Sevvandi [1]
Affiliations
[1] RMIT Univ, Sch Sci, Math Sci, Melbourne, Vic 3000, Australia
Keywords
Anomaly detection ensembles; Outlier detection ensembles; Item Response Theory; Unsupervised learning; Latent trait models; Outlier detection
DOI
10.1016/j.ins.2021.12.042
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Ensemble learning combines many algorithms or models to obtain better predictive performance. Ensembles have produced the winning algorithm in competitions such as the Netflix Prize. They are used in climate modeling and relied upon to make daily forecasts. Constructing an ensemble from a heterogeneous set of unsupervised anomaly detection methods presents challenges because the class labels or the ground truth is unknown. Thus, traditional ensemble techniques that use the class labels cannot be used for this task. We use Item Response Theory (IRT) - a class of models used in educational psychometrics - to construct an unsupervised anomaly detection ensemble. IRT's latent trait computation lends itself to anomaly detection because the latent trait can be used to uncover the hidden ground truth. Using a novel IRT mapping to the anomaly detection problem, we construct an ensemble that can downplay noisy, non-discriminatory methods and accentuate sharper methods. We demonstrate the effectiveness of the IRT ensemble using two real data repositories and show that it outperforms other ensemble techniques. We find that the IRT ensemble performs well even if the set of anomaly detection methods have low correlation values. (c) 2021 Elsevier Inc. All rights reserved.
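The idea in the abstract, a latent trait that stands in for the unknown ground truth and per-method weights that downplay noisy detectors, can be illustrated with a minimal sketch. This is not the paper's IRT estimation procedure: it swaps in a linear one-factor model fitted by a singular value decomposition, where the loadings only loosely play the role of IRT discrimination parameters. The function name `latent_trait_ensemble`, the synthetic detectors, and all constants are assumptions made for illustration.

```python
import numpy as np

def latent_trait_ensemble(scores):
    """Combine detector scores with a linear one-factor model (NOT the paper's IRT fit).

    scores: array of shape (n_observations, n_methods); higher means more anomalous.
    Returns (theta, loadings): a latent score per observation and a weight per method.
    """
    scores = np.asarray(scores, dtype=float)
    # Standardize each detector so that scales are comparable
    # (assumes no detector has constant output).
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    # The first right singular vector gives one loading per detector; detectors
    # that agree with the shared signal receive larger loadings.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a singular vector is arbitrary; orient it so that most
    # detectors load positively.
    if loadings.sum() < 0:
        loadings = -loadings
    theta = z @ loadings  # latent "anomalousness" of each observation
    return theta, loadings

# Synthetic data (hypothetical): 10 planted anomalies among 200 observations.
rng = np.random.default_rng(0)
n = 200
truth = np.zeros(n)
truth[:10] = 1.0
signal = 10.0 * truth

good_a = signal + rng.normal(0.0, 1.0, n)  # informative detector
good_b = signal + rng.normal(0.0, 1.0, n)  # informative detector
noisy = rng.normal(0.0, 1.0, n)            # non-discriminatory detector

scores = np.column_stack([good_a, good_b, noisy])
theta, loadings = latent_trait_ensemble(scores)

print("loadings per method:", np.round(loadings, 3))
print("top-10 observations by latent score:", sorted(np.argsort(theta)[-10:].tolist()))
```

In this toy setting the uninformative detector should receive a loading close to zero, so it contributes little to the combined score. The paper's ensemble makes that distinction with fitted IRT parameters rather than a linear factor.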
Pages: 142-163
Number of pages: 22