ForestLayer: Efficient training of deep forests on distributed task-parallel platforms

Cited by: 19
Authors
Zhu, Guanghui [1 ]
Hu, Qiu [1 ]
Gu, Rong [1 ]
Yuan, Chunfeng [1 ]
Huang, Yihua [1 ]
Affiliation
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Deep forest; Distributed computing; Task-parallel; Random forest; Ray;
DOI
10.1016/j.jpdc.2019.05.001
CLC Classification
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Most existing deep models are deep neural networks. Recently, the deep forest has opened the door to an alternative to deep neural networks for many tasks and has attracted increasing attention. At the same time, the deep forest model has become widely used in real-world applications. However, the existing deep forest system is inefficient and lacks scalability. In this paper, we present ForestLayer, an efficient and scalable deep forest system built on distributed task-parallel platforms. First, to improve computing concurrency and reduce communication overhead, we propose a fine-grained sub-forest based task-parallel algorithm. Next, we design a novel task splitting mechanism to reduce the training time without decreasing the accuracy of the original method. To further improve the performance of ForestLayer, we propose three system-level optimization techniques: lazy scan, pre-pooling, and partial transmission. Besides these system optimizations, we also propose a set of high-level programming APIs to improve the ease of use of ForestLayer. Finally, we have implemented ForestLayer on the distributed task-parallel platform Ray. The experimental results reveal that ForestLayer outperforms the existing deep forest system gcForest with speedups of 7x to 20.9x on a range of datasets. In addition, ForestLayer outperforms a TensorFlow-based implementation on most of the datasets while achieving better predictive performance. Furthermore, ForestLayer achieves good scalability and load balance. (C) 2019 Elsevier Inc. All rights reserved.
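For readers unfamiliar with the sub-forest based task-parallel idea described in the abstract, the sketch below illustrates how one cascade layer could be trained as independent sub-forest tasks on Ray. It is a minimal illustration only: the function names, the sub-forest size, and the use of scikit-learn's RandomForestClassifier are assumptions made for the example and do not reflect ForestLayer's actual API; k-fold cross-validation and multi-grained scanning, which the real system performs, are omitted.

    # Minimal sketch (not ForestLayer's API): train one cascade layer as
    # independent sub-forest tasks on Ray, then merge their class vectors.
    import numpy as np
    import ray
    from sklearn.ensemble import RandomForestClassifier

    ray.init()

    SUB_FOREST_SIZE = 50   # trees per sub-forest task (assumed granularity)
    NUM_SUB_FORESTS = 8    # 8 x 50 = 400 trees in the logical forest

    @ray.remote
    def train_sub_forest(X, y, n_trees, seed):
        """Train one sub-forest as an independent Ray task."""
        clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed, n_jobs=1)
        clf.fit(X, y)
        # Return only the class-probability vectors needed by the next layer,
        # not the fitted trees, to keep communication overhead small.
        return clf.predict_proba(X)

    def train_layer(X, y):
        # Launch all sub-forest training tasks concurrently.
        futures = [train_sub_forest.remote(X, y, SUB_FOREST_SIZE, seed)
                   for seed in range(NUM_SUB_FORESTS)]
        # Averaging the sub-forest outputs approximates the class vector of the
        # full forest, which gcForest-style cascades append as augmented features.
        return np.mean(ray.get(futures), axis=0)

Splitting each logical forest into sub-forest tasks is what yields the finer task granularity the abstract refers to: more tasks can run concurrently across workers, and each task ships back only a small class-vector matrix rather than the trained trees.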
Pages: 113-126
Page count: 14