A Triplet Deviation Network Framework: Boosting Weakly-supervised Anomaly Detection By Ensemble Learning

Cited by: 0
Authors
Ling, Hefei [1 ]
Pan, Shuhui [1 ]
Shi, Yuxuan [1 ]
Li, Qingsong [1 ]
Li, Ping [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2022
Keywords
Anomaly detection; Data mining; Weakly-supervised; Ensemble learning
DOI
10.1109/IJCNN55064.2022.9892290
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Weakly-supervised anomaly detection aims to identify anomalies by exploiting a limited set of labeled anomalies. Existing work uses a deviation network to learn anomaly scores end-to-end: unlabeled data are treated as normal, and the scores of labeled anomalies are pushed to deviate strongly from those of the normal data. However, because anomaly types are diverse, a model trained on a few labeled anomalies is one-sided and cannot generalize to identify other types of anomalies. Simply treating unlabeled data as normal also fails to extract features from the unlabeled data that could improve the model's generalization and accuracy. In this paper, a triplet deviation network framework (TDNF) is proposed. Compared with the original deviation network, it adds a potential-anomaly filtering module and a prior anomaly score generation module. The potential-anomaly filtering module ensembles multiple unsupervised methods to evaluate the data and filter out potential anomalies. Labeled anomalies, potential anomalies, and unlabeled data then compose multiple triplets, which are fed into the deviation network to improve the model's ability to identify different types of anomalies. The prior anomaly score generation module uses one unsupervised method to generate normalized prior anomaly scores. These scores serve as prior knowledge for the deviation network, helping to fine-tune its learning on unlabeled data and to optimize its anomaly ranking of the unlabeled data. We give a triplet deviation network instance, TDN-IHSC, and carry out extensive experiments on multiple real-world datasets. The results show that our method is effective and performs better than four other advanced competitive methods.
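The data-preparation idea the abstract describes — ensemble several unsupervised scorers, normalize and average them into a prior anomaly score, then flag the top-scoring unlabeled points as potential anomalies — can be sketched as follows. This is an illustrative sketch only, not the authors' TDNF implementation; the scorer functions and parameter names are hypothetical stand-ins for the unsupervised detectors the paper ensembles.

```python
def distance_to_mean(data):
    # Unsupervised scorer 1: distance of each point to the sample mean.
    n, dim = len(data), len(data[0])
    mean = [sum(x[j] for x in data) / n for j in range(dim)]
    return [sum((x[j] - mean[j]) ** 2 for j in range(dim)) ** 0.5 for x in data]

def knn_distance(data, k=2):
    # Unsupervised scorer 2: distance of each point to its k-th nearest neighbor.
    scores = []
    for i, x in enumerate(data):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5
            for j, y in enumerate(data) if j != i
        )
        scores.append(dists[min(k, len(dists)) - 1])
    return scores

def minmax(scores):
    # Normalize scores to [0, 1] so different scorers are comparable.
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def prior_scores(data, scorers):
    # Ensemble: average the normalized scores into one prior anomaly score.
    normed = [minmax(f(data)) for f in scorers]
    return [sum(col) / len(col) for col in zip(*normed)]

def filter_potential_anomalies(data, scorers, top_frac=0.1):
    # Flag the top-scoring fraction of unlabeled points as potential anomalies;
    # these would pair with labeled anomalies and unlabeled data to form triplets.
    scores = prior_scores(data, scorers)
    k = max(1, int(len(data) * top_frac))
    ranked = sorted(range(len(data)), key=lambda i: -scores[i])
    return set(ranked[:k]), scores

# Toy unlabeled set with one obvious outlier at index 4.
unlabeled = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2], [5.0, 5.0]]
potential, scores = filter_potential_anomalies(
    unlabeled, [distance_to_mean, knn_distance], top_frac=0.2
)
print(potential)  # prints {4}
```

In the full framework, the normalized prior scores would additionally guide the deviation network's learning on the unlabeled data, while the flagged points supply the anomaly side of the triplets.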
Pages: 8