Adaptive Configuration Selection for Power-Constrained Heterogeneous Systems

Cited by: 36
Authors
Bailey, Peter E. [1]
Lowenthal, David K. [1]
Ravi, Vignesh [2]
Rountree, Barry [3]
Schulz, Martin [3]
de Supinski, Bronis R. [3]
Affiliations
[1] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA
[2] Adv Micro Devices Inc, Sunnyvale, CA 94088 USA
[3] Lawrence Livermore Natl Lab, Livermore, CA USA
Source
2014 43rd International Conference on Parallel Processing (ICPP) | 2014
Funding
U.S. National Science Foundation;
Keywords
PERFORMANCE; ADAPTATION;
DOI
10.1109/ICPP.2014.46
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
As power becomes an increasingly important design factor in high-end supercomputers, future systems will likely operate with power limitations significantly below their peak power specifications. These limitations will be enforced through a combination of software and hardware power policies, which will filter down from the system level to individual nodes. Hardware is already moving in this direction by providing power-capping interfaces to the user. The power/performance trade-off at the node level is critical in maximizing the performance of power-constrained cluster systems, but is also complex because of the many interacting architectural features and accelerators that comprise the hardware configuration of a node. The key to solving this challenge is an accurate power/performance model that will aid in selecting the right configuration from a large set of available configurations. In this paper, we present a novel approach to generate such a model offline using kernel clustering and multivariate linear regression. Our model requires only two iterations to select a configuration, which provides a significant advantage over exhaustive search-based strategies. We apply our model to predict power and performance for different applications using arbitrary configurations, and show that our model, when used with hardware frequency-limiting, selects configurations with significantly higher performance at a given power limit than those chosen by frequency-limiting alone. When applied to a set of 36 computational kernels from a range of applications, our model accurately predicts power and performance; it maintains 91% of optimal performance while meeting power constraints 88% of the time. When the model violates a power constraint, it exceeds the constraint by only 6% in the average case, while simultaneously achieving 54% more performance than an oracle.
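The regression-and-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature choice (frequency in GHz, core count), the function names, and the numbers in the usage note are all hypothetical, and the kernel-clustering step is omitted (in a clustered variant, one pair of models would presumably be fitted per cluster). The sketch fits two multivariate linear models by ordinary least squares, one for power and one for performance, then picks the configuration with the highest predicted performance whose predicted power stays under the cap.

```python
from typing import List, Optional, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear(features: Sequence[Sequence[float]],
               targets: Sequence[float]) -> List[float]:
    """Least-squares fit of target ~ w0 + w . features (normal equations)."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature: Sequence[float]) -> float:
    """Evaluate the fitted linear model on one configuration."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


def select_config(configs: Sequence[Sequence[float]],
                  power_w: Sequence[float],
                  perf_w: Sequence[float],
                  power_cap: float) -> Optional[Sequence[float]]:
    """Return the config with the best predicted performance under the cap,
    or None if no config is predicted to fit."""
    best, best_perf = None, float("-inf")
    for cfg in configs:
        if predict(power_w, cfg) <= power_cap:
            perf = predict(perf_w, cfg)
            if perf > best_perf:
                best, best_perf = cfg, perf
    return best
```

As a usage example with made-up training data, fitting on measurements of a few (frequency, cores) configurations and then calling `select_config(candidates, power_w, perf_w, power_cap=30.0)` returns the candidate with the highest predicted performance among those predicted to draw at most 30 W. Because the models are linear, selection costs one prediction per candidate rather than one measured run per candidate, which is the general advantage over exhaustive search that the abstract points to.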
Pages: 371-380
Page count: 10