An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks

Times cited: 0
Authors
Xu, Xin [1]
Lu, Yang [1]
Zhou, Yupeng [1]
Fu, Zhiguo [1]
Fu, Yanjie [2]
Yin, Minghao [1]
Affiliations
[1] Northeast Normal Univ, Coll Informat Sci & Technol, Dept Comp Sci, Changchun 130117, Peoples R China
[2] Univ Cent Florida, Coll Engn & Comp Sci, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
network representation learning; random walk; stationary distributions; unsupervised learning; network embedding; PREDICTION
DOI
10.3390/math9151767
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Network representation learning aims to learn low-dimensional, compressible, and distributed representation vectors of the nodes in a network. Because node labels are expensive to obtain, many unsupervised network representation learning methods have been proposed, and the random walk strategy is among the most widely used. However, existing random walk based methods face several challenges: (1) they do not adequately explain what network knowledge the sampled walking paths capture; (2) mixing different kinds of information in a network has adverse effects; and (3) methods with hyper-parameters generalize poorly across different networks. This paper proposes an information-explainable, random walk based, unsupervised network representation learning framework named Probabilistic Accepted Walk (PAW), which obtains network representations from the perspective of the stationary distribution of networks. In the framework, we design two stationary distributions, based on nodes' self-information and on the local information of networks, to guide the proposed random walk strategy, which learns representation vectors by sampling paths of nodes. Experimental results show that PAW obtains more expressive representations than six widely used unsupervised network representation learning baselines on four real-world networks in single-label and multi-label node classification tasks.
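The abstract does not state PAW's acceptance rule, so the following is only a minimal sketch of the general idea of a random walk steered toward a chosen stationary distribution. It assumes the standard Metropolis-Hastings acceptance test with a uniform-neighbor proposal, and it uses a uniform target in place of the paper's self-information and local-information distributions. The names `mh_walk`, `adj`, and `target` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def mh_walk(adj, target, start, length, rng):
    """Sample a node path whose visit frequencies approach `target`.

    Assumption: standard Metropolis-Hastings on an undirected graph,
    proposing a uniformly random neighbor. This is not the PAW rule.
    """
    path = [start]
    u = start
    for _ in range(length - 1):
        v = rng.choice(adj[u])  # propose a neighbor of u uniformly
        # q(u -> v) = 1 / deg(u), so the acceptance ratio is
        # target[v] * deg(u) / (target[u] * deg(v)).
        ratio = (target[v] * len(adj[u])) / (target[u] * len(adj[v]))
        if rng.random() < min(1.0, ratio):
            u = v  # accept the move; otherwise stay at u
        path.append(u)
    return path


# Toy undirected graph: a triangle 0-1-2 with a pendant node 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# Hypothetical target: visit every node equally often.
target = {node: 1.0 for node in adj}

rng = random.Random(0)
path = mh_walk(adj, target, start=0, length=20000, rng=rng)
freq = Counter(path)
for node in sorted(adj):
    print(node, round(freq[node] / len(path), 3))
```

A plain uniform-neighbor walk would visit nodes in proportion to their degree; the acceptance step is what shifts the visit frequencies toward the chosen target. In a pipeline of this kind, the sampled paths would then typically be fed to a skip-gram style model to produce node vectors, though the abstract does not specify that step.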
Pages: 14