A mathematical assessment of the isolation random forest method for anomaly detection in big data

Cited by: 0
Authors
Morales, Fernando A. [1, 3]
Ramirez, Jorge M. [1, 2]
Ramos, Edgar A. [1]
Affiliations
[1] Univ Nacl Colombia, Escuela Matemat, Antioquia, Colombia
[2] Oak Ridge Natl Lab, Comp Sci & Math, Oak Ridge, TN USA
[3] Carrera 65 59A 110,43-106, Medellin, Colombia
Keywords
anomaly detection; isolation random forest; Monte Carlo methods; probabilistic algorithms
DOI
10.1002/mma.8570
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We present a mathematical analysis of the Isolation Random Forest method (IRF method) for anomaly detection, proposed by F. T. Liu, K. M. Ting, and Z.-H. Zhou in their seminal work as a heuristic for anomaly detection in big data. We prove that the IRF space can be endowed with a probability measure induced by the Isolation Tree (iTree) algorithm. In this setting, convergence of the IRF method is proved using the Law of Large Numbers. A couple of counterexamples show that the method is inconclusive and that no certificate of quality can be given when it is used to detect anomalies. Hence, an alternative version of the method is proposed whose mathematical foundation is fully justified. Furthermore, a criterion is presented for choosing the number of sampled trees needed to guarantee confidence intervals for the numerical results. Finally, numerical experiments compare the performance of the classic method with that of the proposed one.
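The abstract mentions two ideas that a small example can make concrete: averaging over randomly sampled isolation trees (where the Law of Large Numbers applies) and choosing how many trees to sample for a given confidence level. The sketch below is not the paper's method or its criterion, which this record does not reproduce. It assumes one-dimensional data, uses Hoeffding's inequality as a stand-in bound, and all function names (`isolation_depth`, `mean_depth`, `trees_needed`) are hypothetical.

```python
import math
import random


def isolation_depth(data, x, rng):
    """Count the random splits needed to isolate x (a member of data)."""
    points = list(data)
    depth = 0
    while len(points) > 1:
        lo, hi = min(points), max(points)
        if lo == hi:  # only duplicates remain; they cannot be separated
            break
        split = rng.uniform(lo, hi)
        if split == lo:  # degenerate draw: one side would be empty, resample
            continue
        # Keep only the side of the split that contains x.
        points = [p for p in points if (p < split) == (x < split)]
        depth += 1
    return depth


def mean_depth(data, x, n_trees, seed=0):
    """Monte Carlo estimate of the expected isolation depth of x."""
    rng = random.Random(seed)
    return sum(isolation_depth(data, x, rng) for _ in range(n_trees)) / n_trees


def trees_needed(max_depth, eps, delta):
    """Hoeffding bound (an assumption, not the paper's criterion):
    number of trees so that the sample mean of depths in [0, max_depth]
    is within eps of its expectation with probability at least 1 - delta."""
    return math.ceil(max_depth ** 2 * math.log(2 / delta) / (2 * eps ** 2))
```

For a data set of n points, every split removes at least one point, so the depth lies in [0, n - 1] and `trees_needed(n - 1, eps, delta)` gives a conservative sample size. Calling `mean_depth` for an isolated value and for a value inside a cluster should yield a visibly smaller average depth for the isolated one.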
Pages: 1156-1177
Page count: 22