Towards a Scalable and Distributed Infrastructure for Deep Learning Applications
Cited by: 4
Authors:
Hasheminezhad, Bita [1]
Shirzad, Shahrzad [1]
Wu, Nanmiao [1]
Diehl, Patrick [1]
Schulz, Hannes [2]
Kaiser, Hartmut [1]
Affiliations:
[1] Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803 USA
[2] Microsoft Res Montreal, Montreal, PQ, Canada
Source:
PROCEEDINGS OF 2020 IEEE/ACM 5TH WORKSHOP ON DEEP LEARNING ON SUPERCOMPUTERS (DLS 2020), 2020
Keywords:
Distributed Deep Learning;
High Performance Computing;
HPX;
Asynchronous Many-task System;
DOI:
10.1109/DLS51937.2020.00008
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Although recent scale-up approaches to training deep neural networks have proven effective, the computational intensity of large and complex models, as well as the availability of large-scale datasets, requires deep learning frameworks to employ scale-out techniques as well. Most available distributed deep learning frameworks did not consider parallelization approaches and distribution requirements in their primary designs, and most are still unable to perform effective and efficient fine-grained inter-node communication. We present Phylanx, which has the potential to alleviate these shortcomings. Phylanx provides a productivity-oriented frontend in which user Python code is translated to a futurized execution tree that can be executed efficiently on multiple nodes using the C++ standard library for parallelism and concurrency (HPX), leveraging fine-grained threading and an active-messaging, task-based runtime system.
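To make the "futurized execution" idea concrete, here is a minimal sketch of continuation-based parallelism with HPX futures, the mechanism on which Phylanx's execution trees rest. The chunked reduction and the partial_sum helper are illustrative assumptions of this sketch, not Phylanx's actual translation of Python code.

#include <hpx/hpx_main.hpp>   // starts the HPX runtime and wraps main()
#include <hpx/future.hpp>     // hpx::async, hpx::future, hpx::when_all
#include <functional>
#include <iostream>
#include <numeric>
#include <vector>

// Illustrative helper (an assumption of this sketch): sum one chunk of the data.
double partial_sum(std::vector<double> const& v, std::size_t lo, std::size_t hi)
{
    return std::accumulate(v.begin() + lo, v.begin() + hi, 0.0);
}

int main()
{
    std::vector<double> data(1'000'000, 1.0);
    std::size_t const mid = data.size() / 2;

    // Each chunk becomes an independent lightweight HPX task.
    hpx::future<double> left  = hpx::async(partial_sum, std::cref(data), 0, mid);
    hpx::future<double> right = hpx::async(partial_sum, std::cref(data), mid, data.size());

    // The combine step is attached as a continuation and runs only once
    // both inputs are ready: a tiny two-leaf "execution tree" of futures.
    hpx::future<double> total =
        hpx::when_all(std::move(left), std::move(right)).then([](auto both_f) {
            auto both = both_f.get();  // tuple of the two ready futures
            return hpx::get<0>(both).get() + hpx::get<1>(both).get();
        });

    std::cout << "sum = " << total.get() << "\n";
    return 0;
}

Phylanx composes the same primitives into much larger trees, and HPX's active-messaging (parcel) layer lets such tasks and continuations span node boundaries rather than being confined to one process.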