RAJA: Portable Performance for Large-Scale Scientific Applications

Cited by: 157
Authors
Beckingsale, David Alexander [1]
Burmark, Jason [1]
Hornung, Rich [1]
Jones, Holger [1]
Killian, William [1]
Kunen, Adam J. [1]
Pearce, Olga [1]
Robinson, Peter [1]
Ryujin, Brian S. [1]
Scogland, Thomas R. W. [1]
Affiliations
[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
Source
PROCEEDINGS OF P3HPC 2019: 2019 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE, PORTABILITY AND PRODUCTIVITY IN HPC (P3HPC) | 2019
DOI
10.1109/P3HPC49587.2019.00012
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Modern high-performance computing systems are diverse, with hardware designs ranging from homogeneous multi-core CPUs to GPU- or FPGA-accelerated systems. Achieving desirable application performance often requires choosing a programming model best suited to a particular platform. For large codes used daily in production that are under continual development, architecture-specific ports are untenable. Maintainability requires single-source application code that is performance portable across a range of architectures and programming models. In this paper we describe RAJA, a portability layer that enables C++ applications to leverage various programming models, and thus architectures, with a single-source codebase. We describe preliminary results using RAJA in three large production codes at Lawrence Livermore National Laboratory, observing 17x, 13x and 12x speedup on GPU-only over CPU-only nodes with single-source application code in each case.
Pages: 71-81
Page count: 11