An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems

Cited by: 52
Authors
Gelado, Isaac [1 ]
Cabezas, Javier [1 ]
Navarro, Nacho [1 ]
Stone, John E. [2 ]
Patel, Sanjay [2 ]
Hwu, Wen-mei W. [2 ]
Affiliations
[1] Univ Politecn Cataluna, E-08028 Barcelona, Spain
[2] Univ Illinois, Chicago, IL 60680 USA
Keywords
Design; Experimentation; Performance; Heterogeneous Systems; Data-centric Programming Models; Asymmetric Distributed Shared Memory;
DOI
10.1145/1735971.1736059
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Heterogeneous computing combines general purpose CPUs with accelerators to efficiently execute both sequential control-intensive and data-parallel phases of applications. Existing programming models for heterogeneous computing rely on programmers to explicitly manage data transfers between the CPU system memory and accelerator memory. This paper presents a new programming model for heterogeneous computing, called Asymmetric Distributed Shared Memory (ADSM), that maintains a shared logical memory space for CPUs to access objects in the accelerator physical memory but not vice versa. The asymmetry allows light-weight implementations that avoid common pitfalls of symmetrical distributed shared memory systems. ADSM allows programmers to assign data objects to performance critical methods. When a method is selected for accelerator execution, its associated data objects are allocated within the shared logical memory space, which is hosted in the accelerator physical memory and transparently accessible by the methods executed on CPUs. We argue that ADSM reduces programming efforts for heterogeneous computing systems and enhances application portability. We present a software implementation of ADSM, called GMAC, on top of CUDA in a GNU/Linux environment. We show that applications written in ADSM and running on top of GMAC achieve performance comparable to their counterparts using programmer-managed data transfers. This paper presents the GMAC system and evaluates different design choices. We further suggest additional architectural support that will likely allow GMAC to achieve higher application performance than the current CUDA model.
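The asymmetry described in the abstract can be illustrated with a toy Python model. This is only a sketch under stated assumptions: `ToyADSM`, `alloc_shared`, `cpu_access`, and `run_on_accelerator` are hypothetical names invented for illustration, not GMAC's or CUDA's API, and the dictionary merely stands in for accelerator physical memory. No real data movement or coherence protocol is modeled.

```python
class ToyADSM:
    """Hypothetical toy runtime; names are illustrative, not GMAC's API."""

    def __init__(self):
        self._accel_mem = {}   # stands in for accelerator physical memory
        self._next_handle = 1

    def alloc_shared(self, n):
        """Allocate an n-element object in the shared logical space,
        hosted in (simulated) accelerator memory."""
        handle = self._next_handle
        self._next_handle += 1
        self._accel_mem[handle] = [0.0] * n
        return handle

    def cpu_access(self, handle):
        """CPU code reaches shared objects without an explicit copy call."""
        return self._accel_mem[handle]

    def run_on_accelerator(self, kernel, *args):
        """Run a kernel; it may only receive handles to shared objects."""
        buffers = []
        for arg in args:
            if not isinstance(arg, int) or arg not in self._accel_mem:
                # The asymmetry: the accelerator cannot see CPU-only memory.
                raise PermissionError("accelerator cannot access CPU-only memory")
            buffers.append(self._accel_mem[arg])
        kernel(*buffers)


def scale_by_two(buf):
    """Stand-in for a data-parallel kernel."""
    for i in range(len(buf)):
        buf[i] *= 2.0


def demo():
    rt = ToyADSM()
    v = rt.alloc_shared(4)

    # CPU initializes the shared object in place.
    data = rt.cpu_access(v)
    for i in range(4):
        data[i] = float(i)

    # Accelerator works on the same object; CPU then reads the result.
    rt.run_on_accelerator(scale_by_two, v)
    result = list(rt.cpu_access(v))

    # A plain CPU-side list is outside the shared space and is rejected.
    host_only = [1.0, 2.0]
    try:
        rt.run_on_accelerator(scale_by_two, host_only)
        rejected = False
    except PermissionError:
        rejected = True
    return result, rejected
```

Calling `demo()` shows both directions of the model: the CPU reads and writes the shared object directly, while the simulated accelerator refuses data that lives only in CPU memory.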
Pages: 347-358
Number of pages: 12