CoarSAS2hvec: Heterogeneous Information Network Embedding with Balanced Network Sampling

Cited by: 11
Authors
Zhan, Ling [1]
Jia, Tao [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Keywords
heterogeneous information networks; network embedding; context sampling; random walk; information entropy
DOI
10.3390/e24020276
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
Heterogeneous information network (HIN) embedding is an important tool for tasks such as node classification, community detection, and recommendation. It aims to find node representations that preserve the proximity between entities of different types. A widely adopted family of approaches applies random walks to generate sequences of heterogeneous contexts, from which the embedding is learned. However, due to the multipartite graph structure of an HIN, hub nodes tend to be over-represented relative to their context in the sampled sequences, giving rise to imbalanced samples of the network. Here, we propose a new embedding method: CoarSAS2hvec. Self-avoiding short sequence sampling combined with an HIN coarsening procedure (CoarSAS) is used to better collect the rich information in the HIN. An optimized loss function is used to improve the performance of the HIN structure embedding. CoarSAS2hvec outperforms nine other methods in node classification and community detection on four real-world data sets. Using entropy as a measure of the amount of information, we confirm that CoarSAS captures richer information about the network than other sampling methods do. Hence, the traditional loss function applied to samples generated by CoarSAS can also yield improved results. Our work addresses a limitation of random-walk-based HIN embedding that has not been emphasized before, which can shed light on a range of problems in HIN analyses.
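The sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' CoarSAS procedure: it omits the coarsening step and the loss function entirely, and the toy graph, the function names (`self_avoiding_walk`, `simple_walk`, `node_entropy`), and the choice of Shannon entropy over node-occurrence frequencies are all assumptions made for illustration. The paper's exact entropy definition may differ. The sketch only shows how a self-avoiding short walk differs from an ordinary random walk and how an entropy score for the sampled nodes could be computed.

```python
import math
import random
from collections import Counter


def self_avoiding_walk(adj, start, max_len, rng):
    """Walk from `start` without revisiting nodes; stop at max_len or a dead end."""
    walk = [start]
    visited = {start}
    while len(walk) < max_len:
        candidates = [n for n in adj[walk[-1]] if n not in visited]
        if not candidates:  # every neighbour was already visited
            break
        nxt = rng.choice(candidates)
        walk.append(nxt)
        visited.add(nxt)
    return walk


def simple_walk(adj, start, max_len, rng):
    """Ordinary random walk, for comparison; nodes may repeat."""
    walk = [start]
    while len(walk) < max_len:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def node_entropy(walks):
    """Shannon entropy (bits) of node-occurrence frequencies in the samples."""
    counts = Counter(node for walk in walks for node in walk)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical toy HIN: one hub venue, four papers, two authors.
adj = {
    "venue": ["p1", "p2", "p3", "p4"],
    "p1": ["venue", "a1"],
    "p2": ["venue", "a1"],
    "p3": ["venue", "a2"],
    "p4": ["venue", "a2"],
    "a1": ["p1", "p2"],
    "a2": ["p3", "p4"],
}

if __name__ == "__main__":
    rng = random.Random(0)
    sas = [self_avoiding_walk(adj, s, 4, rng) for s in adj for _ in range(50)]
    rw = [simple_walk(adj, s, 4, rng) for s in adj for _ in range(50)]
    print("self-avoiding entropy:", node_entropy(sas))
    print("simple walk entropy:  ", node_entropy(rw))
```

A higher entropy means node occurrences are spread more evenly across the samples. Whether the self-avoiding sampler scores higher on a given graph depends on the graph and the walk length, so the two printed values should be read as a comparison on this toy example only.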
Pages: 18
Related papers
50 in total
[31]   An evolutionary clustering algorithm of the heterogeneous information network based on embedding technology [J].
Chen, Limin ;
Yang, Jing ;
Zhang, Jianpei .
Harbin Gongcheng Daxue Xuebao/Journal of Harbin Engineering University, 2015, 36 (05) :692-696, 719
[32]   MINE: A method of multi-interaction heterogeneous information network embedding [J].
Zhu D. ;
Sun Y. ;
Li X. ;
Du H. ;
Qu R. ;
Yu P. ;
Piao X. ;
Higgs R. ;
Cao N. .
Computers, Materials and Continua, 2020, 63 (03) :1343-1356
[33]   Label Preserved Heterogeneous Network Embedding [J].
Li, Xiangyu ;
Chen, Weizheng .
NEURAL INFORMATION PROCESSING, ICONIP 2021, PT II, 2021, 13109 :121-132
[34]   Heterogeneous Hyper-Network Embedding [J].
Baytas, Inci M. ;
Xiao, Cao ;
Wang, Fei ;
Jain, Anil K. ;
Zhou, Jiayu .
2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, :875-880
[35]   Dynamic heterogeneous attributed network embedding [J].
Li, Hongbo ;
Zheng, Wenli ;
Tang, Feilong ;
Song, Yitong ;
Yao, Bin ;
Zhu, Yanmin .
INFORMATION SCIENCES, 2024, 662
[36]   Efficient heterogeneous proximity preserving network embedding model [J].
Li, Chen ;
Tang, Ying .
EXPERT SYSTEMS WITH APPLICATIONS, 2019, 134 :201-208
[37]   HPEMed: Heterogeneous Network Pair Embedding for Medical Diagnosis [J].
Li, Mengxi ;
Zhang, Jing ;
Chen, Lixia ;
Fu, Yu ;
Zhou, Cangqi .
COMPUTER SUPPORTED COOPERATIVE WORK AND SOCIAL COMPUTING, CHINESECSCW 2021, PT II, 2022, 1492 :364-375
[38]   Dynamic Heterogeneous Information Network Embedding With Meta-Path Based Proximity [J].
Wang, Xiao ;
Lu, Yuanfu ;
Shi, Chuan ;
Wang, Ruijia ;
Cui, Peng ;
Mou, Shuai .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (03) :1117-1132
[39]   Integrity and Robust Network Embedding of Information Network with AAE [J].
Liu, Bin ;
Chen, Yun-fang ;
Zhang, Wei .
2018 INTERNATIONAL CONFERENCE ON ELECTRICAL, CONTROL, AUTOMATION AND ROBOTICS (ECAR 2018), 2018, 307 :277-281
[40]   MSNE: A Novel Markov Chain Sampling Strategy for Network Embedding [J].
Wang, Ran ;
Song, Yang ;
Dai, Xin-yu .
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT III, 2019, 11441 :107-118