MACAU: SCALABLE BAYESIAN FACTORIZATION WITH HIGH-DIMENSIONAL SIDE INFORMATION USING MCMC

Cited by: 19
Authors
Simm, J. [1,2]
Arany, A. [1,2]
Zakeri, P. [1,2]
Haber, T. [3]
Wegner, J. K. [4]
Chupakhin, V. [4]
Ceulemans, H. [4]
Moreau, Y. [1,2]
Affiliations
[1] Katholieke Univ Leuven, ESAT STADIUS, B-3001 Heverlee, Belgium
[2] IMEC, B-3001 Heverlee, Belgium
[3] Hasselt Univ, B-3500 Hasselt, Belgium
[4] Janssen Pharmaceut, B-2340 Beerse, Belgium
Source
2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING | 2017
Keywords
matrix factorization; side information; high scale machine learning; MCMC
DOI
10.1109/mlsp.2017.8168143
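Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract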
Bayesian matrix factorization is a method of choice for making predictions for large-scale incomplete matrices, due to the availability of efficient Gibbs sampling schemes and its robustness to overfitting. In this paper, we consider factorization of large-scale matrices with high-dimensional side information. However, sampling the link matrix for the side information with standard approaches costs O(F^3) time, where F is the dimensionality of the features. To overcome this limitation, we first propose a prior for the link matrix whose strength is proportional to the scale of the latent variables. Second, using this prior, we derive an efficient sampler with complexity linear in the number of non-zeros, O(N_nz), by leveraging Krylov subspace methods such as block conjugate gradient, allowing us to handle million-dimensional side information. We demonstrate the effectiveness of the proposed method on a drug-protein interaction prediction task.
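The contrast between the O(F^3) cost and the O(N_nz) cost can be illustrated with a small sketch. It is not the authors' implementation, and it makes several simplifying assumptions that do not come from the abstract: a single latent dimension, unit noise precision, a fixed prior precision `lam`, and SciPy's plain conjugate gradient in place of block conjugate gradient. The function name `sample_link_vector` and all variable names are hypothetical. The idea is that a draw from a Gaussian with precision K = F^T F + lam*I can be obtained by solving one linear system with a randomly perturbed right-hand side, and each conjugate gradient iteration only needs products with the sparse feature matrix, so K is never formed or factorized.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg


def sample_link_vector(F, u, lam, rng):
    """Draw beta ~ N(K^{-1} F^T u, K^{-1}) with K = F^T F + lam * I.

    Assumed setup (illustrative only): F is a sparse (n x d) feature
    matrix, u is one latent dimension of length n, lam > 0 is the prior
    precision, and the noise precision is 1.
    """
    n, d = F.shape
    e1 = rng.standard_normal(n)  # perturbation in observation space
    e2 = rng.standard_normal(d)  # perturbation in feature space
    # The perturbed right-hand side has covariance F^T F + lam * I = K,
    # so the solution of K beta = rhs has covariance K^{-1}.
    rhs = F.T @ (u + e1) + np.sqrt(lam) * e2

    # Matrix-free operator: each product costs O(nnz(F)), not O(d^2).
    K = LinearOperator(
        (d, d),
        matvec=lambda v: F.T @ (F @ v) + lam * v,
        dtype=np.float64,
    )
    beta, info = cg(K, rhs, maxiter=1000)
    if info != 0:
        raise RuntimeError("conjugate gradient did not converge")
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F = sp.random(200, 50, density=0.05, format="csr", random_state=1)
    u = rng.standard_normal(200)
    beta = sample_link_vector(F, u, lam=1.0, rng=rng)
    print(beta.shape)
```

A direct sampler would instead build the dense d x d matrix K and factorize it, which is where the cubic cost in the feature dimension comes from. With several latent dimensions there is one right-hand side per dimension, which is the setting block conjugate gradient is designed for.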
Pages: 6