High-order Proximity Preserving Information Network Hashing

Cited by: 33
Authors
Lian, Defu [1]
Zheng, Kai [1]
Zheng, Vincent W. [2]
Ge, Yong [3]
Cao, Longbing [4]
Tsang, Ivor W. [5]
Xie, Xing [6]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Sichuan, Peoples R China
[2] Adv Digital Sci Ctr, Singapore, Singapore
[3] Univ Arizona, Management Informat Syst, Tucson, AZ 85721 USA
[4] Univ Technol Sydney, Adv Analyt Inst, Sydney, NSW, Australia
[5] Univ Technol Sydney, Ctr Artificial Intelligence, Sydney, NSW, Australia
[6] Microsoft Res Asia, Beijing, Peoples R China
Source
KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2018
Funding
National Natural Science Foundation of China; Australian Research Council; National Research Foundation, Singapore
Keywords
Information Network Hashing; Matrix Factorization; Hamming Subspace Learning;
DOI
10.1145/3219819.3220034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Information network embedding is an effective way to perform efficient graph analytics. However, it still faces computational challenges in problems such as link prediction and node recommendation, particularly as networks grow in scale. Hashing is a promising approach for accelerating these problems by orders of magnitude. However, no prior studies have focused on seeking binary codes for information networks that preserve high-order proximity. Since matrix factorization (MF) unifies and outperforms several well-known embedding methods that preserve high-order proximity, we propose an MF-based Information Network Hashing (INH-MF) algorithm to learn binary codes that preserve high-order proximity. We also suggest Hamming subspace learning, which updates only part of the binary codes at a time, to scale up INH-MF. Finally, we evaluate INH-MF on four real-world information network datasets on the tasks of node classification and node recommendation. The results demonstrate that INH-MF performs significantly better than competing learning-to-hash baselines in both tasks and, surprisingly, outperforms network embedding methods, including DeepWalk, LINE and NetMF, in the task of node recommendation. The source code of INH-MF is available online (1).
Pages: 1744-1753
Page count: 10