Energy cost evaluation of parallel algorithms for multiprocessor systems

Cited by: 11

Authors:

Wang, Zhuowei [1]

Xu, Xianbin [1]

Xiong, Naixue [2]

Yang, Laurence T. [3]

Zhao, Wuqing [1]

Affiliations:

[1] Wuhan Univ, Sch Comp, Wuhan 430000, Peoples R China

[2] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA

[3] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS B2G 1C0, Canada

Source:

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2013, Vol. 16, Issue 01

Keywords:

GPUs; Parallel algorithms; Energy scalability; Energy conservation; Performance; GPU;

DOI:

10.1007/s10586-011-0188-1

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

With the continuous development of hardware and software, Graphics Processing Units (GPUs) have come into use for general-purpose computation. They have emerged as computational accelerators that, working with CPUs, dramatically reduce application execution time. To achieve high computing performance, a GPU typically includes hundreds of computing units. This high density of computing resources on a chip leads to high power consumption, which has therefore become one of the most important problems in the development of GPUs. This paper analyzes the energy consumption of parallel algorithms executed on GPUs and provides a method for evaluating the energy scalability of parallel algorithms. The parallel prefix sum is then analyzed to illustrate the method for energy conservation, and the energy scalability is evaluated experimentally using Sparse Matrix-Vector Multiply (SpMV). The results show that the optimal number of blocks, the choice of memory, and task scheduling are the keys to balancing the performance and the energy consumption of GPUs.

Pages: 77-90

Page count: 14
