Internal Evaluation of Unsupervised Outlier Detection

Cited by: 21
Authors
Marques, Henrique O. [1]
Campello, Ricardo J. G. B. [2]
Sander, Jorg [3]
Zimek, Arthur [4]
Affiliations
[1] Univ Sao Paulo, Inst Math & Comp Sci ICMC, BR-13566590 Sao Carlos, SP, Brazil
[2] Univ Newcastle, Sch Math & Phys Sci MAPS, Univ Dr, Callaghan, NSW 2308, Australia
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
[4] Univ Southern Denmark, Dept Math & Comp Sci IMADA, Campusvej 55, DK-5230 Odense, Denmark
Funding
Swedish Research Council; Natural Sciences and Engineering Research Council of Canada; São Paulo Research Foundation, Brazil
Keywords
Outlier detection; unsupervised evaluation; validation; distance-based outliers
DOI
10.1145/3394053
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although there is a large and growing literature that tackles the unsupervised outlier detection problem, the unsupervised evaluation of outlier detection results is still virtually untouched in the literature. The so-called internal evaluation, based solely on the data and the assessed solutions themselves, is required if one wants to statistically validate (in absolute terms) or just compare (in relative terms) the solutions provided by different algorithms or by different parameterizations of a given algorithm in the absence of labeled data. However, in contrast to unsupervised cluster analysis, where indexes for internal evaluation and validation of clustering solutions have been conceived and shown to be very useful, in the outlier detection domain, this problem has been notably overlooked. Here we discuss this problem and provide a solution for the internal evaluation of outlier detection results. Specifically, we describe an index called Internal, Relative Evaluation of Outlier Solutions (IREOS) that can evaluate and compare different candidate outlier detection solutions. Initially, the index is designed to evaluate binary solutions only, referred to as top-n outlier detection results. We then extend IREOS to the general case of non-binary solutions, consisting of outlier detection scorings. We also statistically adjust IREOS for chance and extensively evaluate it in several experiments involving different collections of synthetic and real datasets.
Pages: 42