Ensemble of Local Decision Trees for Anomaly Detection in Mixed Data

Cited by: 1
Authors
Aryal, Sunil [1]
Wells, Jonathan R. [1]
Affiliation
[1] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES | 2021 / Vol. 12975
Keywords
Anomaly detection; Mixed data; LOF; IForest; Ensemble anomaly detection; Decision trees; SUPPORT;
DOI
10.1007/978-3-030-86486-6_42
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Anomaly Detection (AD) is used in many real-world applications such as cybersecurity, banking, and national intelligence. Though many AD algorithms have been proposed in the literature, their effectiveness in practical real-world problems is rather limited. This is mainly because most of them: (i) examine anomalies globally with respect to the entire dataset, although some anomalies exhibit suspicious characteristics only with respect to their local neighbourhood (local context) and appear normal in the global context; and (ii) assume that all data features are numeric, whereas real-world data have both numeric (quantitative) and categorical (qualitative) features. In this paper, we propose a simple, robust solution to address both issues. The main idea is to partition the data space and build local models in different regions rather than building a single global model for the entire data space. To cover sufficient local context around a test instance, multiple local models from different partitions (an ensemble of local models) are used. As local models, we use classical decision trees, which handle both numeric and categorical features well. Our results show that an Ensemble of Local Decision Trees (ELDT) produces better and more consistent detection accuracy than popular state-of-the-art AD methods, particularly on datasets with mixed feature types.
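The partition-then-ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' ELDT implementation: the abstract does not say how the regions are formed or how the decision trees produce an anomaly score, so this sketch assumes nearest-centre regions around randomly sampled points and swaps in a much simpler local model (distance to the region mean, scaled by the region's average spread). It handles numeric features only, and every function name and parameter below is hypothetical.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def fit_partition(data, n_regions, rng):
    """Build one partition: sample centers, then fit one local model per region.

    Assumption: the local model is just (mean, spread), standing in for the
    decision trees used in the paper.
    """
    centers = rng.sample(data, n_regions)
    members = [[] for _ in centers]
    for p in data:
        members[nearest(p, centers)].append(p)

    models = []
    for center, pts in zip(centers, members):
        pts = pts or [center]  # duplicate centers can leave a region empty
        dim = len(pts[0])
        mean = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        # Average distance to the local mean; the floor avoids division by zero.
        spread = max(sum(math.dist(p, mean) for p in pts) / len(pts), 1e-9)
        models.append((mean, spread))
    return centers, models


def fit_ensemble(data, n_partitions=25, n_regions=4, seed=0):
    """Fit several independent random partitions of the same data."""
    rng = random.Random(seed)
    return [fit_partition(data, n_regions, rng) for _ in range(n_partitions)]


def score(point, ensemble):
    """Average the local scores from every partition; higher means more anomalous."""
    total = 0.0
    for centers, models in ensemble:
        mean, spread = models[nearest(point, centers)]
        total += math.dist(point, mean) / spread
    return total / len(ensemble)
```

Because each partition draws different centers, a test instance falls into a differently shaped region each time, and averaging the scores covers more of its local context than any single partition would.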
Pages: 687-702
Page count: 16