Learning Heterogeneous Network Embedding From Text and Links

Cited by: 2
Authors
Long, Yunfei [1]
Xiang, Rong [1]
Lu, Qin [1]
Xiong, Dan [1]
Huang, Chu-Ren [2]
Bi, Chenglin [3]
Li, Mingle [4]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[2] Hong Kong Polytech Univ, Dept Chinese & Bilingual Studies, Hong Kong, Hong Kong, Peoples R China
[3] Adv Micro Devices Shanghai, Shanghai 201203, Peoples R China
[4] Huawei Technol Co Ltd, Shenzhen 518100, Peoples R China
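Source
IEEE ACCESS | 2018 / Vol. 6
Keywords
Network embedding; heterogeneous network; attention mechanism; text processing; INTERNET
DOI
10.1109/ACCESS.2018.2873044
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract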
Finding methods to represent multiple types of nodes in heterogeneous networks is both challenging and rewarding, as there is much less work in this area compared with that of homogeneous networks. In this paper, we propose a novel approach to learn node embedding for heterogeneous networks through a joint learning framework of both network links and text associated with nodes. A novel attention mechanism is also used to make good use of text extended through links to obtain much larger network context. Link embedding is first learned through a random-walk-based method to process multiple types of links. Text embedding is separately learned at both sentence level and document level to capture salient semantic information more comprehensively. Then, both types of embeddings are jointly fed into a hierarchical neural network model to learn node representation through mutual enhancement. The attention mechanism follows linked edges to obtain context of adjacent nodes to extend context for node representation. The evaluation on a link prediction task in a heterogeneous network data set shows that our method outperforms the current state-of-the-art method by 2.5%-5.0% in AUC values with p-value less than 10^-9, indicating very significant improvement.
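The abstract says link embedding is learned with a random-walk-based method over multiple link types, but it does not give the walk strategy. The following is only a minimal sketch of one possible reading: uniform random walks over an undirected graph whose edges carry a type label. The node names, link types, and function names are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, link_type, target) triples."""
    adj = defaultdict(list)
    for src, link_type, dst in edges:
        adj[src].append((link_type, dst))
        adj[dst].append((link_type, src))
    return adj


def random_walk(adj, start, length, rng):
    """Walk up to `length` nodes, picking the next hop uniformly over all links.

    Assumption: every link type is sampled with equal weight; the paper may
    weight or constrain link types differently.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1])
        if not neighbors:
            break  # dead end: stop the walk early
        _, next_node = rng.choice(neighbors)
        walk.append(next_node)
    return walk


def generate_walks(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, length, rng))
    return walks


# Hypothetical toy network with two node types and three link types.
EDGES = [
    ("user:alice", "follows", "user:bob"),
    ("user:alice", "posts", "doc:1"),
    ("user:bob", "posts", "doc:2"),
    ("doc:1", "cites", "doc:2"),
]
ADJ = build_adjacency(EDGES)
WALKS = generate_walks(ADJ, walks_per_node=2, length=5, seed=42)
```

In random-walk embedding methods generally, node sequences like these are treated as sentences and passed to a skip-gram style model to obtain node vectors. Whether the paper follows exactly this recipe is not stated in the abstract.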
Pages: 55850-55860
Page count: 11