Statistical and machine learning models for optimizing energy in parallel applications

Cited by: 4
Authors
Endrei, Mark [1, 2]
Jin, Chao [1, 2]
Dinh, Minh Ngoc [1, 2]
Abramson, David [1, 2]
Poxon, Heidi [3]
DeRose, Luiz [4]
de Supinski, Bronis R. [5]
Affiliations
[1] Univ Queensland, Res Comp Ctr, Brisbane, Qld 4072, Australia
[2] Univ Queensland, Sch ITEE, Brisbane, Qld 4072, Australia
[3] Cray Inc, Programming Environm Grp, Bloomington, MN USA
[4] Cray Inc, Bloomington, MN USA
[5] Lawrence Livermore Natl Lab, LC, Livermore, CA 94550 USA
Funding
Australian Research Council
Keywords
Energy efficiency; performance; regression modeling; machine learning; high performance computing;
DOI
10.1177/1094342019842915
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH). We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
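To illustrate what "Pareto-optimal trade-off" means in the abstract, the following is a minimal sketch of selecting non-dominated configurations from a set of (runtime, energy) measurements, where lower is better for both. It is not the authors' model or code: the function name `pareto_front`, the configuration labels, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (label, runtime in seconds, energy in joules); lower is better for both.
Run = Tuple[str, float, float]


def pareto_front(runs: List[Run]) -> List[Run]:
    """Return the runs that no other run beats on both runtime and energy."""
    front: List[Run] = []
    best_energy = float("inf")
    # Visit runs from fastest to slowest (ties broken by lower energy).
    # A run is kept only if it uses strictly less energy than every
    # run that is at least as fast.
    for run in sorted(runs, key=lambda r: (r[1], r[2])):
        if run[2] < best_energy:
            front.append(run)
            best_energy = run[2]
    return front


# Hypothetical measurements, invented for this example.
runs: List[Run] = [
    ("cfg_a", 100.0, 5000.0),  # fastest, highest energy among kept runs
    ("cfg_b", 112.0, 3400.0),  # slower, but much lower energy
    ("cfg_c", 120.0, 3600.0),  # dominated by cfg_b
    ("cfg_d", 105.0, 5200.0),  # dominated by cfg_a
]

front = pareto_front(runs)
print([label for label, _, _ in front])
```

Each configuration left on the front represents a distinct trade-off: moving along it gives up some runtime in exchange for lower energy, which is the kind of option the abstract says users can choose between.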
Pages: 1079-1097
Page count: 19