node2vec: Scalable Feature Learning for Networks

Cited by: 7288
Authors
Grover, Aditya [1]
Leskovec, Jure [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Source
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2016
Funding
US National Science Foundation
Keywords
Information networks; Feature learning; Node embeddings; Graph representations; PREDICTION; DATABASE;
DOI
10.1145/2939672.2939754
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
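
The abstract mentions a biased random walk but gives no details. The following is a minimal Python sketch of one such walk, not the authors' implementation. It assumes an unweighted, undirected graph stored as a dictionary of neighbor sets, and it borrows the return parameter `p` and in-out parameter `q` from the published paper rather than from this record. The function name `biased_walk` and the graph layout are hypothetical and chosen for illustration only.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one second-order biased random walk (illustrative sketch).

    adj    : dict mapping each node to the set of its neighbors
             (assumed unweighted and undirected)
    start  : node at which the walk begins
    length : maximum number of nodes in the returned walk
    p, q   : assumed return and in-out parameters; smaller p favors
             stepping back, smaller q favors moving farther away
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted so a seeded rng is reproducible
        if not nbrs:
            break  # dead end: the current node has no neighbors
        if len(walk) == 1:
            # First step has no previous node, so choose uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move to distance 2 from prev
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(biased_walk(toy_graph, start=0, length=8, p=0.5, q=2.0,
                      rng=random.Random(0)))
```

In this sketch, walks sampled from every node would serve as the "network neighborhoods" that the embedding objective tries to preserve. Setting `p = q = 1` reduces the procedure to a uniform random walk.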
Pages: 855-864
Page count: 10