An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems

Cited by: 52
Authors
Gelado, Isaac [1 ]
Cabezas, Javier [1 ]
Navarro, Nacho [1 ]
Stone, John E. [2 ]
Patel, Sanjay [2 ]
Hwu, Wen-mei W. [2 ]
Affiliations
[1] Univ Politecn Cataluna, E-08028 Barcelona, Spain
[2] Univ Illinois, Chicago, IL 60680 USA
Keywords
Design; Experimentation; Performance; Heterogeneous Systems; Data-centric Programming Models; Asymmetric Distributed Shared Memory
DOI
10.1145/1735971.1736059
Chinese Library Classification (CLC)
TP31 [Computer Software]
Discipline codes
081202; 0835
Abstract
Heterogeneous computing combines general-purpose CPUs with accelerators to efficiently execute both the sequential, control-intensive and the data-parallel phases of applications. Existing programming models for heterogeneous computing rely on programmers to explicitly manage data transfers between the CPU system memory and accelerator memory. This paper presents a new programming model for heterogeneous computing, called Asymmetric Distributed Shared Memory (ADSM), that maintains a shared logical memory space in which CPUs can access objects in the accelerator physical memory but not vice versa. This asymmetry allows lightweight implementations that avoid common pitfalls of symmetric distributed shared memory systems. ADSM allows programmers to assign data objects to performance-critical methods. When a method is selected for accelerator execution, its associated data objects are allocated within the shared logical memory space, which is hosted in the accelerator physical memory and transparently accessible by the methods executed on CPUs. We argue that ADSM reduces programming effort for heterogeneous computing systems and enhances application portability. We present a software implementation of ADSM, called GMAC, built on top of CUDA in a GNU/Linux environment. We show that applications written for ADSM and running on top of GMAC achieve performance comparable to their counterparts using programmer-managed data transfers. This paper presents the GMAC system and evaluates different design choices. We further suggest additional architectural support that would allow GMAC to achieve higher application performance than the current CUDA model.
Pages: 347-358
Page count: 12