Online Influence Forest for Streaming Anomaly Detection

Cited by: 0
Authors
Martins, Ines [1, 2]
Resende, Joao S. [3, 4]
Gama, Joao [1, 2]
Affiliations
[1] INESC TEC, Porto, Portugal
[2] Univ Porto, Porto, Portugal
[3] NOVA LINCS, Lisbon, Portugal
[4] Univ Nova Lisboa, Lisbon, Portugal
Source
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023 | 2023, Vol. 13876
Keywords
Streaming data; Online; Incremental; Unsupervised Anomaly detection; Ensemble; Kurtosis; Influence function;
DOI
10.1007/978-3-031-30047-9_22
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the digital world grows, data is being collected at high speed on a continuous and real-time scale. Hence, the imposed imbalanced and evolving scenario that introduces learning from streaming data remains a challenge. As the research field is still open to consistent strategies that assess continuous and evolving data properties, this paper proposes an unsupervised, online, and incremental anomaly detection ensemble of influence trees that implement adaptive mechanisms to deal with inactive or saturated leaves. This proposal features the fourth standardized moment, also known as kurtosis, as the splitting criteria and the isolation score, Shannon's information content, and the influence function of an instance as the anomaly score. In addition to improving interpretability, this proposal is also evaluated on publicly available datasets, providing a detailed discussion of the results.
Pages: 274-286
Page count: 13