sonLP: Social Network Link Prediction by Principal Component Regression

被引:0
作者
Bao, Zhifeng [1 ]
Zeng, Yong [1 ]
Tay, Y. C. [1 ]
机构
[1] Natl Univ Singapore, Singapore 117548, Singapore
来源
2013 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM) | 2013年
关键词
link prediction; imbalanced samples;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Social networks are driven by social interaction and therefore dynamic. When modeled as a graph, nodes and links are continually added and deleted, and there is considerable interest in social network analysis on predicting link formation. Current work has not adequately addressed three issues: (1) Most link predictors start with using features from the link topology as input. How do features in other dimensions of the social network data affect link formation? (2) The dynamic nature of social networks implies the features driving link formation are constantly changing. How can a predictor automatically select the features that are important for link formation? (3) Node pairs that are not linked can outnumber links by orders of magnitude, but previous work do not address this imbalance. How can we design a predictor that is robust with respect to link imbalance? This paper presents sonLP, a social network link predictor. It uses principal component analysis to identify features that are important to link prediction, its tradeoff between true and false positives is near optimal for a wide range of link imbalance, and it has optimal time complexity. Experiments with coauthorship prediction in the ACM researcher community also show the importance of using features outside the links' dimension.
引用
收藏
页码:370 / 377
页数:8
相关论文
共 24 条
[11]   The link-prediction problem for social networks [J].
Liben-Nowell, David ;
Kleinberg, Jon .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2007, 58 (07) :1019-1031
[12]  
Lichtenwalter R.N., 2010, P 16 ACM SIGKDD INT, P243
[13]  
Lu Y., 2007, P 15 ACM INT C MULT, P301, DOI [10.1145/1291233.1291297, DOI 10.1145/1291233.129129715]
[14]  
PENNOCK D, 2002, P NATL ACAD SCI, V99
[15]  
Popescul Alexandrin, 2003, P WORKSH LEARN STAT
[16]  
Potgieter A, 2007, SPROUTS WORKING PAPE, V7
[17]  
Reitz F, 2010, LECT NOTES COMPUT SC, V6273, P216, DOI 10.1007/978-3-642-15464-5_23
[18]  
Reuther P, 2006, LECT NOTES COMPUT SC, V4172, P508
[19]  
Rossetti G., 2011, 2011 IEEE International Conference on Data Mining Workshops, P979, DOI 10.1109/ICDMW.2011.150
[20]   Fast principal component analysis using fixed-point algorithm [J].
Sharma, Alok ;
Pahwal, Kuldip K. .
PATTERN RECOGNITION LETTERS, 2007, 28 (10) :1151-1155