Network Sampling Using k-hop Random Walks for Heterogeneous Network Embedding

Cited by: 3
Authors
Anil, Akash [1]
Singhal, Shubham [1]
Jain, Piyush [1]
Singh, Sanasam Ranbir [1]
Ladhar, Ajay [2]
Singh, Sandeep [2]
Chugh, Uppinder [1]
Affiliations
[1] Indian Inst Technol Guwahati, Gauhati, Assam, India
[2] Natl Inst Technol Silchar, Silchar, Assam, India
Source
PROCEEDINGS OF THE 6TH ACM IKDD CODS AND 24TH COMAD | 2019
Keywords
Heterogeneous Network; Random Walk; Network Embedding; DBLP; Co-authorship; Network Sampling;
DOI
10.1145/3297001.3297060
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Capturing neighborhood information by generating node sequences or node samples is an important prerequisite step for many neural network embedding approaches. The majority of recent studies on neural network embedding use a random walk as the sampling method, which traverses adjacent neighbors to generate the node sequences. Traversing only immediate neighbors may not be suitable, particularly for heterogeneous information networks (HIN), where adjacent nodes tend to belong to different types. Therefore, this paper proposes a random-walk-based sampling approach (RW-k) which generates node sequences such that adjacent nodes in the sequence are separated by k edges, preserving the k-hop proximity characteristics. We use the node sequences generated by RW-k sampling for network embedding with the skip-gram model. The performance of the network embedding is then evaluated on a future co-authorship prediction task over three heterogeneous bibliographic networks. We compare the efficacy of network embedding using the proposed RW-k sampling with recently proposed network embedding models based on random walks, namely Metapath2vec, Node2vec and VERSE. The results show that RW-k yields better-quality embeddings and outperforms the baselines in the majority of cases.
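The sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `rw_k_walk`, the adjacency-dictionary format, the uniform choice of neighbor at each hop, and the toy author-paper graph are all assumptions made for illustration. The sketch also assumes that "separated by k edges" means the next node is reached by walking k edges from the current one; the abstract does not say whether the paper additionally constrains shortest-path distance.

```python
import random


def rw_k_walk(adj, start, k, length, rng=None):
    """Sample a node sequence in which consecutive nodes are k walk-hops apart.

    adj    : dict mapping each node to a list of its neighbors (assumed format)
    start  : node at which the sequence begins
    k      : number of edges traversed between consecutive recorded nodes
    length : desired number of nodes in the returned sequence
    rng    : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    walk = [start]
    current = start
    while len(walk) < length:
        node = current
        # Take k ordinary random-walk steps, recording only the endpoint.
        for _ in range(k):
            neighbors = adj.get(node, [])
            if not neighbors:
                return walk  # dead end: stop early with what we have
            node = rng.choice(neighbors)
        walk.append(node)
        current = node
    return walk


# Hypothetical toy bibliographic network: authors (a*) linked to papers (p*).
toy_adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "a3": ["p2"],
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
}

if __name__ == "__main__":
    # With k=2, a walk started at an author records only authors,
    # because every second hop in this bipartite graph returns to an author.
    print(rw_k_walk(toy_adj, "a1", k=2, length=6, rng=random.Random(0)))
```

In this toy graph, k=1 reduces to an ordinary random walk that alternates between authors and papers, while k=2 yields author-only sequences. Sequences of this kind would then serve as the "sentences" for a skip-gram model, as the abstract describes.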
Pages: 354-357
Page count: 4