DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines

Cited by: 9
Authors
Cheng, Yang [1 ]
Li, Dan [2 ]
Guo, Zhiyuan [3 ]
Jiang, Binyao [4 ]
Lin, Jiaxin [3 ]
Fan, Xi [4 ]
Geng, Jinkun [2 ]
Yu, Xinyi [4 ]
Bai, Wei [5 ]
Qu, Lei [5 ]
Shu, Ran [5 ]
Cheng, Peng [5 ]
Xiong, Yongqiang [5 ]
Wu, Jianping [2 ]
Affiliations
[1] Tsinghua Univ, Microsoft Res, Beijing, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Beihang Univ, Microsoft Res, Beijing, Peoples R China
[4] Shanghai Jiao Tong Univ, Microsoft Res, Shanghai, Peoples R China
[5] Microsoft Res, Redmond, WA USA
Source
PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; data preprocessing; cloud computing; FPGAs;
DOI
10.1145/3337821.3337892
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In recent years, deep learning (DL) has prospered again owing to advances in both computing power and learning theory. Most emerging studies focus on accelerating the training of DL models but overlook data preprocessing, even though preprocessing can significantly affect the overall performance of end-to-end DL workflows. Our studies on several image DL workloads show that existing preprocessing backends are quite inefficient: they either suffer poor throughput (a 30% degradation) or burn too many (> 10) CPU cores. Based on these observations, we propose DLBooster, a high-performance data preprocessing pipeline that selectively offloads key workloads to FPGAs to meet the stringent preprocessing demands of cutting-edge DL applications. Our testbed experiments show that, compared with existing baselines, DLBooster achieves 1.35x to 2.4x the image processing throughput in several DL workloads while consuming only 1/10 of the CPU cores, and it also reduces latency by 1/3 in online image inference.
Pages: 11