Exploring performance and power properties of modern multi-core chips via simple machine models

Cited by: 43
Authors
Hager, Georg [1]
Treibig, Jan [1]
Habich, Johannes [1]
Wellein, Gerhard [1]
Affiliations
[1] Erlangen Regional Computing Center (RRZE), Martensstr. 1, D-91058 Erlangen, Germany
Keywords
multi-core; power modeling; performance modeling; ECM model; PARALLEL;
DOI
10.1002/cpe.3180
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Modern multi-core chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single-core and multi-core performance of streaming kernels. The model refines the well-known roofline model, because it can predict the scaling and the saturation behavior of bandwidth-limited loop kernels on a multi-core chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different hardware requirements, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multi-core processors and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice Boltzmann flow solver code. Copyright (c) 2013 John Wiley & Sons, Ltd.
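The saturation behavior mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's ECM formulation: it only assumes that performance scales linearly with the core count until it reaches a bandwidth-imposed ceiling. The function names, parameters, and numbers below are hypothetical and chosen purely for illustration.

```python
import math


def saturated_performance(n_cores, p_single, p_saturated):
    """Illustrative multi-core performance estimate (assumption, not the paper's model).

    n_cores     -- number of active cores
    p_single    -- performance of one core (arbitrary units)
    p_saturated -- ceiling imposed by the shared memory bandwidth (same units)
    """
    # Linear scaling with the core count, capped at the bandwidth ceiling.
    return min(n_cores * p_single, p_saturated)


def saturation_point(p_single, p_saturated):
    """Smallest core count at which the bandwidth ceiling is reached."""
    return math.ceil(p_saturated / p_single)


if __name__ == "__main__":
    # Hypothetical values: one core delivers 1.0 units, the chip saturates at 3.5.
    for n in range(1, 9):
        print(n, saturated_performance(n, 1.0, 3.5))
    print("saturation at", saturation_point(1.0, 3.5), "cores")
```

Under these assumptions, adding cores beyond the saturation point raises power draw without raising performance, which is why the abstract singles out the saturation point as relevant for energy consumption.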
Pages: 189-210
Page count: 22