Distributed Matrix Completion

被引:50
作者
Teflioudi, Christina [1 ]
Makari, Faraz [1 ]
Gemulla, Rainer [1 ]
机构
[1] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
来源
12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012) | 2012年
关键词
parallel and distributed matrix factorization; stochastic gradient descent; ALS; recommender systems;
D O I
10.1109/ICDM.2012.120
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We discuss parallel and distributed algorithms for large-scale matrix completion on problems with millions of rows, millions of columns, and billions of revealed entries. We focus on in-memory algorithms that run on a small cluster of commodity nodes; even very large problems can be handled effectively in such a setup. Our DALS, ASGD, and DSGD++ algorithms are novel variants of the popular alternating least squares and stochastic gradient descent algorithms; they exploit thread-level parallelism, in-memory processing, and asynchronous communication. We provide some guidance on the asymptotic performance of each algorithm and investigate the performance of both our algorithms and previously proposed MapReduce algorithms in large-scale experiments. We found that DSGD++ outperforms competing methods in terms of overall runtime, memory consumption, and scalability. Using DSGD++, we can factor a matrix with 10B entries on 16 compute nodes in around 40 minutes.
引用
收藏
页码:655 / 664
页数:10
相关论文
共 21 条
[11]  
Dror G., 2011, KDDCUP 2011 WORKSH
[12]   Collaborative Filtering for Implicit Feedback Datasets [J].
Hu, Yifan ;
Koren, Yehuda ;
Volinsky, Chris .
ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, :263-+
[13]   MATRIX FACTORIZATION TECHNIQUES FOR RECOMMENDER SYSTEMS [J].
Koren, Yehuda ;
Bell, Robert ;
Volinsky, Chris .
COMPUTER, 2009, 42 (08) :30-37
[14]  
Kushner H. J., 2003, Stochastic Approximation and Recursive Algorithms and Applications, volume35 of Applications of Mathematics (New York). Stochastic Modelling and Applied Probability, V2nd
[15]  
Liu C., 2010, P 19 INT C WORLD WID, P681, DOI [10.1145/1772690.1772760, DOI 10.1145/1772690.1772760]
[16]  
Mackey Lester., 2011, NIPS, P1
[17]  
McDonald Ryan, 2010, P ANN C N AM CHAPT A, P456
[18]  
Recht B., 2011, Optimization Online
[19]   An Architecture for Parallel Topic Models [J].
Smola, Alexander ;
Narayanamurthy, Shravan .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2010, 3 (01) :703-710
[20]   DISTRIBUTED ASYNCHRONOUS DETERMINISTIC AND STOCHASTIC GRADIENT OPTIMIZATION ALGORITHMS [J].
TSITSIKLIS, JN ;
BERTSEKAS, DP ;
ATHANS, M .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1986, 31 (09) :803-812