Isolation Distributional Kernel: A New Tool for Point and Group Anomaly Detections

Cited by: 13
Authors
Ting, Kai Ming [1]
Xu, Bi-Cun [1]
Washio, Takashi [2]
Zhou, Zhi-Hua [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Osaka Univ, Inst Sci & Ind Res, Suita, Osaka 5650871, Japan
Keywords
Kernel; Anomaly detection; Detectors; Feature extraction; Time complexity; Task analysis; Hilbert space; Distributional kernel; Kernel mean embedding
DOI
10.1109/TKDE.2021.3120277
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the Isolation Distributional Kernel (IDK) as a new way to measure the similarity between two distributions. Existing approaches based on kernel mean embedding, which convert a point kernel to a distributional kernel, have two key issues: the point kernel employed has a feature map with intractable dimensionality, and it is data-independent. This paper shows that IDK, which is based on a data-dependent point kernel, addresses both issues. We demonstrate IDK's efficacy and efficiency as a new tool for kernel-based anomaly detection for both point and group anomalies. Without explicit learning, using IDK alone outperforms the existing kernel-based point anomaly detector OCSVM and other kernel mean embedding methods that rely on the Gaussian kernel. For group anomaly detection, we introduce an IDK-based detector called IDK2. It reformulates the problem of group anomaly detection in input space as the problem of point anomaly detection in Hilbert space, without the need for learning. IDK2 runs orders of magnitude faster than the group anomaly detector OCSMM. We reveal for the first time that an effective kernel-based anomaly detector based on kernel mean embedding must employ a characteristic kernel that is data-dependent.
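As a rough illustration of the kernel mean embedding idea the abstract refers to, the sketch below builds a distributional kernel by averaging a point kernel over all pairs of samples, then scores each point by its similarity to the whole dataset. It uses the Gaussian kernel, i.e. the data-independent baseline the abstract contrasts with, not the isolation-based kernel of IDK itself. The function names, the bandwidth `sigma`, and the toy data are assumptions made for this example and do not come from the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian point kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_embedding_kernel(X, Y, sigma=1.0):
    """Empirical distributional kernel: the mean of k(x, y) over all pairs.

    This equals the inner product of the empirical kernel mean embeddings
    of the two samples X and Y.
    """
    total = sum(gaussian_kernel(x, y, sigma) for x in X for y in Y)
    return total / (len(X) * len(Y))


def point_scores(D, sigma=1.0):
    """Similarity of each point, treated as a one-point sample, to all of D.

    Assumption for this sketch: a lower similarity means more anomalous.
    """
    return [mean_embedding_kernel([x], D, sigma) for x in D]


# Hypothetical toy data: four clustered points and one distant point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = point_scores(data)
most_anomalous = min(range(len(data)), key=lambda i: scores[i])
```

No model is fitted here: the score is a direct similarity computation, which mirrors the "without explicit learning" setting in the abstract. Replacing `gaussian_kernel` with a data-dependent point kernel is the change the abstract argues for.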
Pages: 2697-2710
Page count: 14