A mathematical assessment of the isolation random forest method for anomaly detection in big data

Authors
Morales, Fernando A. [1, 3]
Ramirez, Jorge M. [1, 2]
Ramos, Edgar A. [1]
Affiliations
[1] Univ Nacl Colombia, Escuela Matemat, Antioquia, Colombia
[2] Oak Ridge Natl Lab, Comp Sci & Math, Oak Ridge, TN USA
[3] Carrera 65 59A 110,43-106, Medellin, Colombia
Keywords
anomaly detection; isolation random forest; Monte Carlo methods; probabilistic algorithms
DOI
10.1002/mma.8570
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We present the mathematical analysis of the Isolation Random Forest Method (IRF Method) for anomaly detection, proposed by Liu F.T., Ting K.M., and Zhou Z.H. in their seminal work as a heuristic method for anomaly detection in Big Data. We prove that the IRF space can be endowed with a probability induced by the Isolation Tree algorithm (iTree). In this setting, the convergence of the IRF method is proved using the Law of Large Numbers. Two counterexamples are presented to show that the method is inconclusive and that no certificate of quality can be given when it is used to detect anomalies. Hence, an alternative version of the method is proposed whose mathematical foundation is fully justified. Furthermore, a criterion is presented for choosing the number of sampled trees needed to guarantee confidence intervals for the numerical results. Finally, numerical experiments are presented to compare the performance of the classic method with the proposed one.
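The Monte Carlo idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or their tree-count criterion. It assumes one-dimensional data, builds each random isolation tree on the full data set (no subsampling), and uses a normal-approximation confidence interval. The function names `isolation_depth` and `mean_depth_with_ci` are hypothetical.

```python
import math
import random
import statistics


def isolation_depth(data, target, rng):
    """Depth at which `target` is isolated by one random tree (1-D data).

    Assumption: `target` is an element of `data`. Each step picks a uniform
    split between the current min and max and keeps the side holding target.
    """
    points = list(data)
    depth = 0
    while len(points) > 1:
        lo, hi = min(points), max(points)
        if lo == hi:  # remaining points are duplicates and cannot be split
            break
        split = rng.uniform(lo, hi)
        if target < split:
            points = [p for p in points if p < split]
        else:
            points = [p for p in points if p >= split]
        depth += 1
    return depth


def mean_depth_with_ci(data, target, n_trees, seed=0, z=1.96):
    """Average isolation depth over `n_trees` random trees.

    Returns (mean, half_width), where half_width is a normal-approximation
    confidence half-width (z = 1.96 for roughly 95%). Requires n_trees >= 2.
    """
    rng = random.Random(seed)
    depths = [isolation_depth(data, target, rng) for _ in range(n_trees)]
    mean = statistics.fmean(depths)
    half_width = z * statistics.stdev(depths) / math.sqrt(n_trees)
    return mean, half_width


# Illustrative data: twenty clustered points and one distant point.
data = [i / 10 for i in range(20)] + [10.0]
outlier_mean, outlier_hw = mean_depth_with_ci(data, 10.0, n_trees=500)
inlier_mean, inlier_hw = mean_depth_with_ci(data, 1.0, n_trees=500)
print(f"outlier depth: {outlier_mean:.2f} +/- {outlier_hw:.2f}")
print(f"inlier depth:  {inlier_mean:.2f} +/- {inlier_hw:.2f}")
```

By the Law of Large Numbers the sample mean depth stabilizes as `n_trees` grows, and the half-width shrinks roughly like the inverse square root of `n_trees`, which is why the number of trees governs the width of the interval. A shallower mean depth is the usual heuristic signal of an anomaly. The abstract cautions that this heuristic alone gives no certificate of quality.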
Pages: 1156-1177
Page count: 22