Multi-Target Vectorization with MTPS C++ Generic Library

Cited by: 0
Authors
Kirschenmann, Wilfried
Plagne, Laurent
Vialle, Stephane
Source
APPLIED PARALLEL AND SCIENTIFIC COMPUTING, PT II | 2012 / Vol. 7134
Keywords
GPU; SSE; Vectorization; C++ Template Metaprogramming; Performances
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This article introduces Multi-Target Parallel Skeleton (MTPS), a C++ template library dedicated to vectorizing algorithms for different target architectures. The library provides skeletons that describe data structures and algorithms, which allow MTPS to generate code with memory access patterns optimized for the chosen architecture. MTPS currently supports x86-64 multicore CPUs and CUDA-enabled GPUs. On these architectures, the observed performance is close to hardware limits.
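The idea of selecting a memory layout per target at compile time can be illustrated with a small sketch. This is not the MTPS API: the tag types `InterleavedTarget` and `PlanarTarget`, the `Layout` policy, the `Collection` container, and the `transform_all` skeleton are all hypothetical names chosen for illustration, and which layout suits which hardware is an assumption of the sketch, not a claim taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical target tags (illustrative only, not MTPS identifiers).
struct InterleavedTarget {};  // components of one element stored together
struct PlanarTarget {};       // one component of all elements stored together

// Layout policy: maps (element e, component c) to a flat storage index.
template <class Target>
struct Layout;

template <>
struct Layout<InterleavedTarget> {
    static std::size_t index(std::size_t e, std::size_t c,
                             std::size_t /*n_elems*/, std::size_t n_comp) {
        return e * n_comp + c;  // array-of-structures ordering
    }
};

template <>
struct Layout<PlanarTarget> {
    static std::size_t index(std::size_t e, std::size_t c,
                             std::size_t n_elems, std::size_t /*n_comp*/) {
        return c * n_elems + e;  // structure-of-arrays ordering
    }
};

// Container whose physical layout is fixed by the Target tag at compile time.
template <class Target>
class Collection {
public:
    Collection(std::size_t n_elems, std::size_t n_comp)
        : n_elems_(n_elems), n_comp_(n_comp), data_(n_elems * n_comp, 0.0f) {}

    float& at(std::size_t e, std::size_t c) {
        assert(e < n_elems_ && c < n_comp_);
        return data_[Layout<Target>::index(e, c, n_elems_, n_comp_)];
    }

    std::size_t size() const { return n_elems_; }
    std::size_t components() const { return n_comp_; }
    const std::vector<float>& raw() const { return data_; }

private:
    std::size_t n_elems_;
    std::size_t n_comp_;
    std::vector<float> data_;
};

// Minimal "map"-style skeleton: the caller writes f once, and the
// storage order it touches depends only on the Target tag.
template <class Target, class F>
void transform_all(Collection<Target>& col, F f) {
    for (std::size_t e = 0; e < col.size(); ++e)
        for (std::size_t c = 0; c < col.components(); ++c)
            col.at(e, c) = f(col.at(e, c));
}
```

In this sketch, user code addresses data only through `at(e, c)` and `transform_all`, so switching the tag from `InterleavedTarget` to `PlanarTarget` changes the physical storage order without touching the algorithm. A real library would additionally specialize the traversal itself per target, which this example does not attempt.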
Pages: 336-346
Page count: 11