PIPES: A Language and Compiler for Task-based Programming on Distributed-Memory Clusters

Cited by: 0
Authors
Kong, Martin [1]
Pouchet, Louis-Noel [2]
Sadayappan, P. [2]
Sarkar, Vivek [1]
Affiliations
[1] Rice Univ, Houston, TX 77251 USA
[2] Ohio State Univ, Columbus, OH 43210 USA
Source
SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS | 2016
Funding
National Science Foundation (USA)
Keywords
Distributed computing; Concurrent Collections; task parallelism; macro-dataflow; polyhedral compilation
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Applications running on clusters of shared-memory computers are often implemented using OpenMP+MPI. Productivity can be vastly improved using task-based programming, a paradigm where the user expresses the data and control-flow relations between tasks, offering the runtime maximal freedom to place and schedule tasks. While productivity is increased, high-performance execution remains challenging: the implementation of parallel algorithms typically requires specific task placement and communication strategies to reduce inter-node communications and exploit data locality. In this work, we present a new macro-dataflow programming environment for distributed-memory clusters, based on the Intel Concurrent Collections (CnC) runtime. Our language extensions let the user define virtual topologies, task mappings, task-centric data placement, task and communication scheduling, etc. We introduce a compiler to automatically generate Intel CnC C++ run-time, with key automatic optimizations including task coarsening and coalescing. We experimentally validate our approach on a variety of scientific computations, demonstrating both productivity and performance.
Pages: 456-467
Page count: 12