Topo2Vec: A Novel Node Embedding Generation Based on Network Topology for Link Prediction

Cited by: 18
Authors
Mallick, Koushik [1]
Bandyopadhyay, Sanghamitra [1]
Chakraborty, Subhasis [2]
Choudhuri, Rounaq [2]
Bose, Sayan [2]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India
[2] RCC Inst Informat Technol, Kolkata 700015, India
Keywords
Data mining; feature learning; graph representation; link prediction; node embedding; pairwise classification
DOI
10.1109/TCSS.2019.2950589
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Link prediction in scale-free networks has become relevant to social network analysis, recommendation systems, and bioinformatics. Recently proposed approaches sample the nodes of a network by simulating random walks. The generated node samples are used to train a neural network that learns contextual information in the form of an embedding vector. This method has gained popularity because the embedding vector is produced from primitive node adjacency information and achieves outstanding performance. This article proposes a naive and scalable approach for generating node samples, based on the principle of goal-oriented greedy search. The generated node samples contain few noisy structures and therefore represent the edge relations of the network better than state-of-the-art methods. Consequently, a better feature embedding of the network's nodes is learned from the samples. The learned feature vectors are used to solve the link prediction problem with a pairwise kernel support vector machine (SVM) classifier, which is expensive in both computation and memory. As an alternative, we deploy a random forest (RF) classifier with a new algebraic operation that yields a symmetric pairwise feature representation of a node pair. We demonstrate the efficacy of the proposed Topo2Vec by testing it against state-of-the-art network context generation algorithms on several real-world networks. Altogether, the proposed Topo2Vec algorithm is a new method for solving the link prediction problem under entirely different settings.
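The symmetric pairwise representation mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which algebraic operation Topo2Vec uses, so the operators below (Hadamard product, absolute difference, average) are common stand-ins from the link prediction literature, not the authors' method. The function names and the toy embeddings are hypothetical. The property illustrated is that the feature vector for a pair (u, v) equals the one for (v, u), so a classifier such as a random forest sees the same input regardless of node order.

```python
from typing import Callable, Dict, List

Vector = List[float]


def hadamard(u: Vector, v: Vector) -> Vector:
    """Element-wise product; symmetric because a * b == b * a."""
    return [a * b for a, b in zip(u, v)]


def abs_diff(u: Vector, v: Vector) -> Vector:
    """Element-wise absolute difference; symmetric because |a - b| == |b - a|."""
    return [abs(a - b) for a, b in zip(u, v)]


def average(u: Vector, v: Vector) -> Vector:
    """Element-wise mean; symmetric because a + b == b + a."""
    return [(a + b) / 2.0 for a, b in zip(u, v)]


# Assumed stand-in operators; the operation proposed in the paper is not given here.
SYMMETRIC_OPERATORS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "hadamard": hadamard,
    "abs_diff": abs_diff,
    "average": average,
}


def pair_features(emb: Dict[str, Vector], u: str, v: str,
                  op: str = "hadamard") -> Vector:
    """Combine two node embeddings into one order-independent feature vector."""
    if len(emb[u]) != len(emb[v]):
        raise ValueError("embeddings must have the same dimension")
    return SYMMETRIC_OPERATORS[op](emb[u], emb[v])


# Hypothetical 3-dimensional embeddings for two nodes.
toy_embeddings = {"a": [0.5, -1.0, 2.0], "b": [1.5, 0.25, -2.0]}

for name in SYMMETRIC_OPERATORS:
    forward = pair_features(toy_embeddings, "a", "b", name)
    backward = pair_features(toy_embeddings, "b", "a", name)
    print(name, forward, forward == backward)
```

Feature vectors built this way for known edges (positive examples) and sampled non-edges (negative examples) could then be passed to any standard classifier.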
Pages: 1306-1317
Page count: 12