Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation

Cited by: 16
Authors
Zhou, Shuang [1]
Huang, Xiao [1]
Liu, Ninghao [2]
Zhou, Huachi [1]
Chung, Fu-Lai [1]
Huang, Long-Kai [3]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Univ Georgia, Athens, GA 30602 USA
[3] Tencent AI Lab, Shenzhen 518057, Guangdong, Peoples R China
Keywords
Graph neural networks; graph anomaly detection; model generalizability; data augmentation
DOI
10.1109/TKDE.2023.3271771
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Graph anomaly detection (GAD) has wide applications in real-world networked systems. In many scenarios, people need to identify anomalies on new (sub)graphs but lack the labels to train an effective detection model. Because recent semi-supervised GAD methods, which leverage the available labels as prior knowledge, outperform unsupervised methods, one natural idea is to apply a trained semi-supervised GAD model directly to the new (sub)graphs for testing. However, we find that existing semi-supervised GAD methods generalize poorly, i.e., well-trained models may not perform well on an unseen area of the graph (one not accessible during training). Motivated by this, we formally define the problem of generalized graph anomaly detection, which aims to effectively identify anomalies on both the training-domain graph(s) and the unseen test graph(s). This is a challenging task, since only limited labels are available and the normal data distribution may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich the training data, and adopt a customized episodic training strategy for learning with the augmented data. Extensive experiments verify the effectiveness of AugAN in improving model generalizability.
Pages: 12721-12735
Page count: 15