NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks

Cited by: 265
Authors
Yu, Wenchao [1 ]
Cheng, Wei [2 ]
Aggarwal, Charu C. [3 ]
Zhang, Kai [4 ]
Chen, Haifeng [2 ]
Wang, Wei [1 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[2] NEC Labs Amer Inc, Princeton, NJ USA
[3] IBM Res AI, San Jose, CA USA
[4] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
Source
KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2018
Funding
National Science Foundation (USA);
Keywords
Anomaly detection; dynamic network embedding; deep autoencoder; clique embedding; outlier detection;
DOI
10.1145/3219819.3220024
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Massive and dynamic networks arise in many practical applications such as social media, security and public health. Given an evolving network, it is crucial to detect structural anomalies, such as vertices and edges whose "behaviors" deviate from the underlying majority of the network, in real time. Recently, network embedding has proven to be a powerful tool for learning low-dimensional representations of vertices that capture and preserve the network structure. However, most existing network embedding approaches are designed for static networks, and thus may not be well suited to a dynamic environment in which the network representation has to be constantly updated. In this paper, we propose a novel approach, NETWALK, for anomaly detection in dynamic networks by learning network representations that can be updated dynamically as the network evolves. We first encode the vertices of the dynamic network as vector representations by clique embedding, which jointly minimizes the pairwise distance between vertex representations within each walk derived from the dynamic network and the reconstruction error of a deep autoencoder, which serves as a global regularization. The vector representations can be computed with constant space requirements using reservoir sampling. On the basis of the learned low-dimensional vertex representations, a clustering-based technique is employed to incrementally and dynamically detect network anomalies. Compared with existing approaches, NETWALK has several advantages: 1) the network embedding can be updated dynamically, 2) streaming network nodes and edges can be encoded efficiently with constant memory usage, 3) it is flexible enough to be applied to different types of networks, and 4) network anomalies can be detected in real time. Extensive experiments on four real datasets demonstrate the effectiveness of NETWALK.
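The abstract states that representations are computed in constant space using reservoir sampling. As a rough illustration of that general idea only, the sketch below implements the classic single-pass reservoir sampling scheme (often called Algorithm R) over a stream of edges. It is not taken from the paper: the function name `reservoir_sample`, the parameter `k`, and the synthetic edge stream are assumptions made for this example, and NETWALK's own use of reservoirs may differ in detail.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return a uniform random sample of at most k items from a stream.

    Classic Algorithm R: memory use is O(k) regardless of stream length.
    (Hypothetical helper for illustration; not the paper's implementation.)
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Synthetic edge stream (an assumption for the demo): a long path graph.
edges = ((u, u + 1) for u in range(10_000))
sample = reservoir_sample(edges, k=5, seed=0)
print(len(sample))
```

Because each arriving item is kept with probability k / (i + 1), every item seen so far is equally likely to be in the reservoir at any point, which is what allows a fixed-size summary of an unbounded edge stream.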
Pages: 2672-2681
Number of pages: 10