On the Acceleration of Wavefront Applications using Distributed Many-Core Architectures

Cited by: 16
Authors
Pennycook, S. J. [1]
Hammond, S. D. [1]
Mudalige, G. R. [2]
Wright, S. A. [1]
Jarvis, S. A. [1]
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
[2] Univ Oxford, Oxford E Res Ctr, Oxford, England
Keywords
wavefront; GPU; many-core computing; CUDA; optimization; performance modelling
DOI
10.1093/comjnl/bxr073
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectures to accelerate pipelined wavefront applications, a ubiquitous class of parallel algorithms used in the solution of a number of scientific and engineering applications. Specifically, we employ a recently developed port of the LU solver (from the NAS Parallel Benchmark suite) to investigate the performance of these algorithms on high-performance computing solutions from NVIDIA (Tesla C1060 and C2050) as well as on traditional clusters (AMD/InfiniBand and IBM BlueGene/P). Benchmark results are presented for problem classes A to C, and a recently developed performance model is used to provide projections for problem classes D and E, the latter of which represents a billion-cell problem. Our results demonstrate that while the theoretical performance of GPU solutions will far exceed that of many traditional technologies, the sustained application performance is currently comparable for scientific wavefront applications. Finally, a breakdown of the GPU solution is conducted, exposing PCIe overheads and decomposition constraints. A new k-blocking strategy is proposed to improve the future performance of this class of algorithms on GPU-based architectures.
Pages: 138-153
Number of pages: 16