Multiobjective GPU design space exploration optimization

Cited by: 5
Authors
Jooya, Ali [1]
Dimopoulos, Nikitas [1]
Baniasadi, Amirali [1]
Affiliations
[1] Univ Victoria, Dept Elect & Comp Engn, Victoria, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Multiobjective optimization; Design space exploration; GPGPU power and performance;
DOI
10.1016/j.micpro.2019.06.001
Chinese Library Classification number
TP3 [Computing technology; computer technology]
Discipline classification code
0812
Abstract
It has been more than a decade since general-purpose applications began targeting GPUs to benefit from the enormous processing power they offer. However, not all applications gain speedup from running on GPUs. If an application does not have enough parallel computation to hide memory latency, running it on a GPU will degrade its performance compared to what it could achieve on a CPU. On the other hand, the efficiency that an application with a high level of parallelism can achieve on a GPU depends on how well the application's memory and computational demands are balanced with the GPU's resources. In this work we tackle the problem of finding a GPU configuration that performs well on a set of GPGPU applications. To achieve this, we propose two models as follows. First, we study the design space of 20 GPGPU applications and show that the relationship between the architectural parameters of the GPU and the power and performance of the application it runs can be learned by a Neural Network (NN). We propose application-specific NN-based predictors that are trained on 5% of the design space and predict the power and performance of the remaining 95% of configurations (the blind set). Although the models make accurate predictions, a few configurations have their power and performance mispredicted. We propose a filtering heuristic that captures most of the predictions with large errors by marking only 5% of the configurations in the blind set as outliers. Using the models and the filtering heuristic, one obtains power and performance values for all configurations in the design space of an application. Searching the design space for a set of configurations that meet certain restrictions on power and performance can be a tedious task, as some applications have large design spaces.
In the second model, we propose to employ the Pareto Front multiobjective optimization technique to obtain a subset of the design space that runs the application optimally in terms of power and performance. We show that the optimum configurations predicted by our model are very close to the actual optimum configurations. While this method gives the optimum configurations for each application, given a set of GPGPU applications, one may look for a configuration that performs well over all the applications. Therefore, we propose a method to find such a configuration with respect to different performance objectives. (C) 2019 Elsevier B.V. All rights reserved.
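As an illustration of the Pareto Front step mentioned in the abstract, here is a minimal sketch in Python. It assumes each configuration is summarized by two values that are both to be minimized, power and execution time; the abstract does not state the exact objectives, so this pairing, the function name `pareto_front`, and the tuple layout are assumptions rather than the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical layout: (power, execution_time), both assumed to be minimized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[int]:
    """Return indices of non-dominated points, in ascending power order.

    A point is dominated if another point is no worse in both objectives
    and strictly better in at least one. Of exact duplicates, only the
    first is kept.
    """
    # Sort by power, breaking ties by execution time.
    order = sorted(range(len(points)), key=lambda i: (points[i][0], points[i][1]))
    front: List[int] = []
    best_time = float("inf")
    for i in order:
        time = points[i][1]
        # Every earlier point uses no more power, so this point survives
        # only if it is strictly faster than all of them.
        if time < best_time:
            front.append(i)
            best_time = time
    return front


if __name__ == "__main__":
    # Made-up (power, time) values for five hypothetical configurations.
    configs = [(100.0, 5.0), (120.0, 3.0), (110.0, 6.0), (150.0, 2.5), (130.0, 3.0)]
    print(pareto_front(configs))
```

Sorting first keeps the scan at O(n log n), which may matter when the predicted design space is large; a pairwise dominance check would give the same front at O(n^2) cost.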
Pages: 198-210
Page count: 13