An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks

Cited by: 0

Authors:

Xu, Xin [1]

Lu, Yang [1]

Zhou, Yupeng [1]

Fu, Zhiguo [1]

Fu, Yanjie [2]

Yin, Minghao [1]

Affiliations:

[1] Northeast Normal Univ, Coll Informat Sci & Technol, Dept Comp Sci, Changchun 130117, Peoples R China

[2] Univ Cent Florida, Coll Engn & Comp Sci, Dept Comp Sci, Orlando, FL 32816 USA

Source:

MATHEMATICS | 2021 / Vol. 9 / Issue 15

Keywords:

network representation learning; random walk; stationary distributions; unsupervised learning; network embedding; PREDICTION

DOI:

10.3390/math9151767

Chinese Library Classification (CLC) number:

O1 [Mathematics]

Discipline classification code:

0701; 070101

Abstract:

Network representation learning aims to learn low-dimensional, compressible, and distributed representational vectors of nodes in networks. Because obtaining label information for nodes is expensive, many unsupervised network representation learning methods have been proposed, among which random walk strategies are widely used. However, existing random walk based methods face several challenges: (1) they do not sufficiently explain what network knowledge the sampled walking paths capture; (2) mixing different kinds of network information has adverse effects; (3) methods that rely on hyper-parameters generalize poorly across different networks. This paper proposes an information-explainable random walk based unsupervised network representation learning framework named Probabilistic Accepted Walk (PAW), which obtains network representations from the perspective of the stationary distribution of networks. In the framework, we design two stationary distributions, based on nodes' self-information and on the local information of networks, to guide the proposed random walk strategy, which learns representational vectors by sampling paths of nodes. Extensive experiments demonstrate that PAW obtains more expressive representations than six widely used unsupervised network representation learning baselines on four real-world networks in single-label and multi-label node classification tasks.
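
The idea of a random walk that is guided toward a chosen stationary distribution through an acceptance step can be illustrated with a generic Metropolis-Hastings walk on a graph. The sketch below is not the PAW algorithm from the paper: the function name `mh_walk`, the toy graph, and the uniform target weights are all assumptions made for illustration, and the paper's self-information and local-information distributions are not reproduced here.

```python
import random
from collections import Counter


def mh_walk(adj, target, start, length, rng):
    """Sample a path with a Metropolis-Hastings random walk.

    adj:    dict mapping node -> list of neighbours (undirected graph)
    target: dict mapping node -> positive, possibly unnormalized weight;
            the walk's stationary distribution is proportional to it
    """
    path = [start]
    u = start
    for _ in range(length - 1):
        # Proposal: pick a neighbour of u uniformly, q(u -> v) = 1 / deg(u).
        v = rng.choice(adj[u])
        # Acceptance ratio: pi(v) q(v -> u) / (pi(u) q(u -> v)).
        ratio = (target[v] * len(adj[u])) / (target[u] * len(adj[v]))
        if rng.random() < min(1.0, ratio):
            u = v  # accept the move
        # On rejection the walk stays at u, which is recorded again.
        path.append(u)
    return path


# Hypothetical toy graph and a uniform target distribution (illustrative only).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
target = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}

rng = random.Random(0)
path = mh_walk(adj, target, start=0, length=20000, rng=rng)
freq = Counter(path)
for node in sorted(adj):
    print(node, round(freq[node] / len(path), 3))
```

With a uniform target, the empirical visit frequencies approach equal shares even though the node degrees differ, whereas a plain uniform-neighbour walk would visit nodes in proportion to degree. In a representation learning pipeline, paths sampled this way would typically be fed to a skip-gram style model, as in other random walk based methods.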


Pages: 14
