A compound OpenMP/MPI program development toolkit for hybrid CPU/GPU clusters

Cited by: 8
Authors
Li, Hung-Fu [1]
Liang, Tyng-Yeu [1]
Chiu, Jun-Yao [1]
Affiliations
[1] Natl Kaohsiung Univ Appl Sci, Dept Elect Engn, Kaohsiung, Taiwan
Keywords
Hybrid CPU/GPU clusters; Compound OpenMP/MPI; CUDA; Load balance; Device directive; INTERFACE; COMPILER
DOI
10.1007/s11227-013-0912-0
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a program development toolkit called OMPICUDA for hybrid CPU/GPU clusters. With this toolkit, users can develop applications for a hybrid CPU/GPU cluster in a familiar programming model, compound OpenMP and MPI, instead of mixed CUDA and MPI or software distributed shared memory (SDSM). Through an extended device directive, they can also choose the type of resource used to execute each parallel region of the same program according to the properties of that region. In addition, the toolkit provides a set of data-partition interfaces that let users achieve load balance at the application level regardless of the type of resources used to execute their programs.
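As a rough illustration of application-level load balancing by data partitioning, the sketch below splits an index range into contiguous chunks whose sizes are proportional to a relative speed assigned to each resource. It is not the OMPICUDA interface: the function name `partition_by_weight`, its parameters, and the example weights are all hypothetical and chosen only for this sketch.

```python
def partition_by_weight(n_items, weights):
    """Split range(n_items) into contiguous [start, end) chunks.

    Hypothetical helper, not part of OMPICUDA. Each chunk size is
    roughly proportional to the matching entry in `weights`, which
    stands for the assumed relative speed of one CPU or GPU worker.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = sum(weights)
    bounds = []
    start = 0
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if i == len(weights) - 1:
            end = n_items  # the last chunk always absorbs rounding error
        else:
            end = round(n_items * cumulative / total)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # Assumed example: one CPU worker and one GPU worker taken to be
    # four times faster, sharing 1000 loop iterations.
    for worker, (lo, hi) in enumerate(partition_by_weight(1000, [1.0, 4.0])):
        print(f"worker {worker}: iterations [{lo}, {hi})")
```

Rounding cumulative boundaries rather than individual chunk sizes keeps the chunks contiguous and guarantees that every index is assigned exactly once.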
Pages: 381-405
Page count: 25