High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn

Authors
Basilio B. Fraguela
Diego Andrade
Affiliations
[1] Universidade da Coruña
[2] CITIC-Research Center of Information and Communication Technologies
Source
The Journal of Supercomputing | 2021 / Volume 77
Keywords
Dataflow computing; Hybrid parallelism; PGAS; Runtimes; Programmability; High-performance computing
Abstract
Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library that supports this model in hybrid shared/distributed memory systems on top of a Partitioned Global Address Space environment. While the initial version of the library provided good results, it suffered from a key restriction that heavily limited its performance and scalability: each process had to consider all the tasks in the application rather than only those of interest to it, an overhead that naturally grows with both the number of processes and the number of tasks in the system. In this paper, this restriction is lifted, enabling our library to provide higher levels of performance. In experiments using 768 cores, performance improved by up to 40.1%, with an average improvement of 16.1%.
Pages: 7676-7689
Page count: 13