FlexPS: Flexible Parallelism Control in Parameter Server Architecture

Cited by: 39
Authors
Huang, Yuzhen [1 ]
Jin, Tatiana [1 ]
Wu, Yidi [1 ]
Cai, Zhenkun [1 ]
Yan, Xiao [1 ]
Yang, Fan [1 ]
Li, Jinfeng [1 ]
Guo, Yuying [1 ]
Cheng, James [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2018, Vol. 11, No. 5
Keywords
OPTIMIZATION; FRAMEWORK; EFFICIENT; ALGORITHM;
DOI
10.1145/3177732.3177734
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
As a general abstraction for coordinating the distributed storage and access of model parameters, the parameter server (PS) architecture enables distributed machine learning to handle large datasets and high-dimensional models. Many systems, such as Parameter Server and Petuum, have been developed based on the PS architecture and are widely used in practice. However, none of these systems supports changing parallelism during runtime, which is crucial for the efficient execution of machine learning tasks with dynamic workloads. We propose a new system, called FlexPS, which introduces a novel multi-stage abstraction to support flexible parallelism control. With the multi-stage abstraction, a machine learning task can be mapped to a series of stages, and the parallelism for each stage can be set according to its workload. Optimizations such as a stage scheduler, a stage-aware consistency controller, and direct model transfer are proposed to make multi-stage machine learning in FlexPS efficient. As a general and complete PS system, FlexPS also incorporates many optimizations that are not limited to multi-stage machine learning. We conduct extensive experiments using a variety of machine learning workloads, showing that FlexPS achieves significant speedups and resource savings compared with state-of-the-art PS systems such as Petuum and Multiverso.
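To make the multi-stage abstraction concrete, the following is a minimal, self-contained Python sketch under assumed names (Stage, run_stage, and the example stage bodies are illustrative placeholders, not the actual FlexPS API): a machine learning task is expressed as a series of stages, and each stage is launched with a degree of parallelism matched to its workload.

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    num_workers: int               # parallelism chosen to match this stage's workload
    body: Callable[[int], None]    # work performed by each worker, given its worker id

def run_stage(stage: Stage) -> None:
    # Stand-in for a stage scheduler: launch exactly num_workers workers for
    # this stage and wait for all of them before the next stage begins.
    with ThreadPoolExecutor(max_workers=stage.num_workers) as pool:
        list(pool.map(stage.body, range(stage.num_workers)))

def feature_selection(worker_id: int) -> None:
    print(f"[feature_selection] worker {worker_id}: light workload")

def sgd_training(worker_id: int) -> None:
    print(f"[sgd_training] worker {worker_id}: heavy workload")

# A task mapped to a series of stages with different parallelism; in FlexPS
# the model would be handed to the next stage's workers via direct model
# transfer rather than through extra round trips to the parameter servers.
task = [
    Stage("feature_selection", num_workers=2, body=feature_selection),
    Stage("sgd_training", num_workers=8, body=sgd_training),
]

for stage in task:
    run_stage(stage)

The sketch omits the machinery the abstract attributes to the real system, such as the stage-aware consistency controller that enforces consistency guarantees within each stage.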
Pages: 566-579
Page count: 14