ForestLayer: Efficient training of deep forests on distributed task-parallel platforms

Cited by: 19
Authors
Zhu, Guanghui [1 ]
Hu, Qiu [1 ]
Gu, Rong [1 ]
Yuan, Chunfeng [1 ]
Huang, Yihua [1 ]
Affiliation
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Deep forest; Distributed computing; Task-parallel; Random forest; Ray;
DOI
10.1016/j.jpdc.2019.05.001
CLC Classification
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Most existing deep models are deep neural networks. Recently, the deep forest has opened the door to an alternative to deep neural networks for many tasks and has attracted increasing attention. At the same time, the deep forest model has become widely used in real-world applications. However, the existing deep forest system is inefficient and lacks scalability. In this paper, we present ForestLayer, an efficient and scalable deep forest system built on distributed task-parallel platforms. First, to improve computing concurrency and reduce communication overhead, we propose a fine-grained sub-forest based task-parallel algorithm. Next, we design a novel task splitting mechanism to reduce the training time without decreasing the accuracy of the original method. To further improve the performance of ForestLayer, we propose three system-level optimization techniques: lazy scan, pre-pooling, and partial transmission. Besides these system optimizations, we also propose a set of high-level programming APIs to improve the ease of use of ForestLayer. Finally, we have implemented ForestLayer on the distributed task-parallel platform Ray. The experimental results reveal that ForestLayer outperforms the existing deep forest system gcForest with speedups of 7x to 20.9x on a range of datasets. In addition, ForestLayer outperforms a TensorFlow-based implementation on most of the datasets while achieving better predictive performance. Furthermore, ForestLayer achieves good scalability and load balance. (C) 2019 Elsevier Inc. All rights reserved.
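For readers unfamiliar with the sub-forest based task-parallel idea described in the abstract, the sketch below illustrates how one cascade layer could be trained as independent sub-forest tasks on Ray. It is a minimal illustration only: the function names, the sub-forest size, and the use of scikit-learn's RandomForestClassifier are assumptions made for the example and do not reflect ForestLayer's actual API; k-fold cross-validation and multi-grained scanning, which the real system performs, are omitted.

    # Minimal sketch (not ForestLayer's API): train one cascade layer as
    # independent sub-forest tasks on Ray, then merge their class vectors.
    import numpy as np
    import ray
    from sklearn.ensemble import RandomForestClassifier

    ray.init()

    SUB_FOREST_SIZE = 50   # trees per sub-forest task (assumed granularity)
    NUM_SUB_FORESTS = 8    # 8 x 50 = 400 trees in the logical forest

    @ray.remote
    def train_sub_forest(X, y, n_trees, seed):
        """Train one sub-forest as an independent Ray task."""
        clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed, n_jobs=1)
        clf.fit(X, y)
        # Return only the class-probability vectors needed by the next layer,
        # not the fitted trees, to keep communication overhead small.
        return clf.predict_proba(X)

    def train_layer(X, y):
        # Launch all sub-forest training tasks concurrently.
        futures = [train_sub_forest.remote(X, y, SUB_FOREST_SIZE, seed)
                   for seed in range(NUM_SUB_FORESTS)]
        # Averaging the sub-forest outputs approximates the class vector of the
        # full forest, which gcForest-style cascades append as augmented features.
        return np.mean(ray.get(futures), axis=0)

Splitting each logical forest into sub-forest tasks is what yields the finer task granularity the abstract refers to: more tasks can run concurrently across workers, and each task ships back only a small class-vector matrix rather than the trained trees.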
Pages: 113-126
Page count: 14