PME: Projected Metric Embedding on Heterogeneous Networks for Link Prediction

Cited by: 189
Authors
Chen, Hongxu [1]
Yin, Hongzhi [1]
Wang, Weiqing [2]
Wang, Hao [3]
Quoc Viet Hung Nguyen [4]
Li, Xue [1]
Affiliations
[1] Univ Queensland, Brisbane, Qld, Australia
[2] Monash Univ, Melbourne, Vic, Australia
[3] 360 Search Lab, Beijing, People's Republic of China
[4] Griffith Univ, Gold Coast, Australia
Source
KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2018
Funding
National Natural Science Foundation of China
Keywords
Heterogeneous Network Embedding; Link Prediction
DOI
10.1145/3219819.3219986
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Heterogeneous information network embedding aims to embed heterogeneous information networks (HINs) into low-dimensional spaces, in which each vertex is represented as a low-dimensional vector and both the global and local network structures of the original space are preserved. However, most existing heterogeneous information network embedding models adopt the dot product to measure proximity in the low-dimensional space, and thus they can only preserve first-order proximity and are insufficient to capture the global structure. Compared with homogeneous information networks, HINs contain multiple types of links (i.e., multiple relations), and the link distribution w.r.t. relations is highly skewed. To address these challenges, we propose a novel heterogeneous information network embedding model, PME, based on metric learning, to capture both first-order and second-order proximities in a unified way. To alleviate the potential geometrical inflexibility of existing metric learning approaches, we propose to build object and relation embeddings in a separate object space and relation spaces rather than in a common space. We then learn embeddings by first projecting vertices from the object space to the corresponding relation space and then calculating the proximity between the projected vertices. To overcome the heavy skewness of the link distribution w.r.t. relations and avoid "over-sampling" or "under-sampling" for each relation, we propose a novel loss-aware adaptive sampling approach for model optimization. Extensive experiments have been conducted on a large-scale HIN dataset, and the experimental results show the superiority of our proposed PME model in terms of prediction accuracy and scalability.
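
The two ideas named in the abstract, measuring proximity after projecting vertices into a relation-specific space and sampling relations according to their current loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the squared Euclidean distance, the hinge-style margin loss, the loss-proportional sampling rule, and every function and parameter name below are assumptions chosen for illustration only.

```python
import random


def project(matrix, vec):
    """Apply a relation-specific projection matrix (a list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def projected_sq_distance(matrix, u, v):
    """Squared Euclidean distance between u and v after projection (assumed metric)."""
    pu, pv = project(matrix, u), project(matrix, v)
    return sum((a - b) ** 2 for a, b in zip(pu, pv))


def triplet_hinge_loss(matrix, anchor, positive, negative, margin=1.0):
    """Hypothetical margin loss: a linked pair should be closer than an
    unlinked pair by at least `margin` in the relation space."""
    d_pos = projected_sq_distance(matrix, anchor, positive)
    d_neg = projected_sq_distance(matrix, anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


def relation_sampling_probs(loss_per_relation):
    """Assumed sampling rule: probability proportional to each relation's loss.
    Falls back to a uniform distribution when every loss is zero."""
    total = sum(loss_per_relation.values())
    if total == 0:
        n = len(loss_per_relation)
        return {r: 1.0 / n for r in loss_per_relation}
    return {r: loss / total for r, loss in loss_per_relation.items()}


def sample_relation(loss_per_relation, rng=random):
    """Draw one relation to train on next, favouring relations with higher loss."""
    probs = relation_sampling_probs(loss_per_relation)
    relations = list(probs)
    weights = [probs[r] for r in relations]
    return rng.choices(relations, weights=weights, k=1)[0]
```

In this sketch a relation whose triplets are already well separated contributes little loss and is therefore drawn less often, which is one simple way to avoid over-sampling frequent relations; the sampling scheme actually used by PME may differ.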
Pages: 1177-1186
Number of pages: 10