DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines

Cited by: 9
Authors
Cheng, Yang [1 ]
Li, Dan [2 ]
Guo, Zhiyuan [3 ]
Jiang, Binyao [4 ]
Lin, Jiaxin [3 ]
Fan, Xi [4 ]
Geng, Jinkun [2 ]
Yu, Xinyi [4 ]
Bai, Wei [5 ]
Qu, Lei [5 ]
Shu, Ran [5 ]
Cheng, Peng [5 ]
Xiong, Yongqiang [5 ]
Wu, Jianping [2 ]
Affiliations
[1] Tsinghua Univ, Microsoft Res, Beijing, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Beihang Univ, Microsoft Res, Beijing, Peoples R China
[4] Shanghai Jiao Tong Univ, Microsoft Res, Shanghai, Peoples R China
[5] Microsoft Res, Redmond, WA USA
Source
PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; data preprocessing; cloud computing; FPGAs;
DOI
10.1145/3337821.3337892
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In recent years, deep learning (DL) has prospered again owing to advances in both computing power and learning theory. Most emerging studies focus on accelerating the training of DL models but overlook data preprocessing, even though preprocessing can significantly affect the overall performance of end-to-end DL workflows. Our studies on several image DL workloads show that existing preprocessing backends are quite inefficient: they either suffer poor throughput (a 30% degradation) or burn too many (> 10) CPU cores. Based on these observations, we propose DLBooster, a high-performance data preprocessing pipeline that selectively offloads key workloads to FPGAs to meet the stringent preprocessing demands of cutting-edge DL applications. Our testbed experiments show that, compared with existing baselines, DLBooster achieves 1.35x to 2.4x the image processing throughput in several DL workloads while consuming only 1/10 of the CPU cores, and it also reduces latency by 1/3 in online image inference.
Pages: 11