Random-walk-based or Similarity-based Methods, Which is Better for Directed Graph Embedding?

Cited by: 1
Authors
Hamedani, Masoud Reyhani [1 ]
Ryu, Jin-Su [1 ]
Kim, Sang-Wook [1 ]
Affiliations
[1] Hanyang Univ, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024 | 2024
Funding
National Research Foundation of Singapore
Keywords
directed graph embedding; random walk; similarity measure;
DOI
10.1109/BigComp60711.2024.00022
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Graph embedding methods map the nodes of a graph into low-dimensional latent vectors that preserve the graph's underlying semantic information and can be used for graph inference and analysis. Graph embedding is applicable to both directed and undirected graphs; however, it is more challenging for the former due to their asymmetric nature. Single-vector embedding methods, which provide a single latent vector for each node, have limitations in preserving the asymmetric information in directed graphs. To alleviate this problem, double-vector embedding methods provide two latent vectors, i.e., a source vector and a target vector, for each node; in this paper, we first divide these methods into the following two categories based on their graph embedding mechanism: random-walk-based methods (RWM) and similarity-based methods (SBM). Although the effectiveness of double-vector and single-vector embedding methods has been widely studied in the literature, there is no comprehensive study that evaluates and compares the effectiveness of state-of-the-art RWM and SBM in directed graph embedding. This paper provides such a study by conducting extensive experiments on seven real-world datasets. Our experimental results demonstrate that 1) double-vector embedding methods are not always superior to single-vector ones in directed graph embedding, and 2) whether random-walk-based or similarity-based double-vector embedding methods are superior depends on factors such as the machine learning task, the dataset, and even the evaluation metric.
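To illustrate the single-vector versus double-vector distinction the abstract draws, here is a minimal sketch with hypothetical two-dimensional vectors (the node names and values are illustrative, not from the paper): scoring an edge with one vector per node is inherently symmetric, while separate source and target vectors let the score of u → v differ from that of v → u.

```python
def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Single-vector embedding: one latent vector per node.
emb = {"u": [1.0, 0.0], "v": [0.0, 1.0]}

def score_single(a, b):
    # dot(emb[a], emb[b]) == dot(emb[b], emb[a]), so the score
    # cannot distinguish the edge u -> v from v -> u.
    return dot(emb[a], emb[b])

# Double-vector embedding: a source vector and a target vector per node.
src = {"u": [1.0, 0.0], "v": [0.0, 0.0]}
tgt = {"u": [0.0, 0.0], "v": [1.0, 0.0]}

def score_double(a, b):
    # Score of the directed edge a -> b: source vector of a
    # against target vector of b.
    return dot(src[a], tgt[b])

# The single-vector score is symmetric in its arguments.
assert score_single("u", "v") == score_single("v", "u")
# The double-vector score preserves direction: u -> v scores high,
# the reverse edge v -> u does not.
assert score_double("u", "v") > score_double("v", "u")
```

The design choice behind double-vector methods is exactly this asymmetry: a node can act differently as the origin of an edge than as its destination, which a single shared vector cannot express.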
Pages: 83-89
Page count: 7