Deep Learning at Scale on NVIDIA V100 Accelerators

Cited by: 0
Authors
Xu, Rengan [1 ]
Han, Frank [1 ]
Ta, Quy [1 ]
Affiliation
[1] Dell EMC, AI Engn Server & Infrastruct Syst, Austin, TX 78759 USA
Source
PROCEEDINGS OF 2018 IEEE/ACM PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS 2018) | 2018
Keywords
Deep Learning; Distributed Training; GPU; Benchmarking; V100;
DOI
10.1109/PMBS.2018.00006
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
The recent explosion in the popularity of Deep Learning (DL) is due to a combination of improved algorithms, access to large datasets, and increased computational power. This has led to a plethora of open-source DL frameworks, each with varying characteristics and capabilities. End users are then left with the difficult task of determining the software and hardware configurations that yield optimal performance from each framework. We share our experiences and develop best practices for DL training with TensorFlow, MXNet, and Caffe2. The paper also looks at DL inference with TensorRT on NVIDIA V100 "Volta" GPUs. It focuses on one of the more prominent neural network architectures, ResNet-50, combined with the ImageNet dataset. We quantify the impact of hardware attributes on DL workloads, including the use of PCIe versus NVLink GPUs, scaling beyond a single worker node, the effect of a high-speed interconnect such as InfiniBand EDR on training, and the implications and advantages of using network-attached storage.
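As context for the multi-GPU training described in the abstract, the following is a minimal sketch of data-parallel ResNet-50 training with TensorFlow and Horovod. The use of Horovod, the synthetic input data, and all hyperparameters (batch size, learning-rate scaling, step counts) are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch: data-parallel ResNet-50 training with Horovod + tf.keras.
# Launch with one process per GPU, e.g.: horovodrun -np 4 python train_resnet50.py
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # initialize Horovod; ranks map to GPUs/nodes

# Pin each process to a single local GPU.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], 'GPU')

# Synthetic ImageNet-shaped data (224x224x3 images, 1000 classes) as a stand-in
# for a real input pipeline; batch size per GPU is an assumption.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.uniform([256, 224, 224, 3]),
     tf.random.uniform([256], maxval=1000, dtype=tf.int64))
).repeat().batch(64)

model = tf.keras.applications.ResNet50(weights=None, classes=1000)

# Linear learning-rate scaling with the number of workers, a common
# large-batch SGD recipe; the base rate 0.1 is illustrative.
opt = tf.keras.optimizers.SGD(learning_rate=0.1 * hvd.size(), momentum=0.9)
opt = hvd.DistributedOptimizer(opt)  # all-reduce gradients across workers

model.compile(loss='sparse_categorical_crossentropy', optimizer=opt)

callbacks = [
    # Broadcast initial weights from rank 0 so all workers start identically.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

model.fit(dataset, steps_per_epoch=100, epochs=1,
          callbacks=callbacks, verbose=1 if hvd.rank() == 0 else 0)

The same script runs unchanged on one node with NVLink-connected GPUs or across several nodes over InfiniBand; only the launcher arguments (process count and host list) change, which is what makes this style of data-parallel setup convenient for the scaling comparisons the paper performs.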
Pages: 23 - 32
Page count: 10