Benchmarking Optimization Algorithms for Auto-Tuning GPU Kernels

Cited by: 4
Authors
Schoonhoven, Richard Arnoud [1, 2]
van Werkhoven, Ben [1, 3]
Batenburg, Kees Joost [1, 2]
Affiliations
[1] Ctr Wiskunde & Informat, Computat Imaging Grp, NL-1098 XG Amsterdam, Netherlands
[2] Leiden Univ, Leiden Inst Adv Comp Sci, NL-2311 EZ Leiden, Netherlands
[3] Netherlands eSci Ctr, NL-1098 XH Amsterdam, Netherlands
Funding
Dutch Research Council
Keywords
Auto-tuning; evolutionary computing; fitness landscape analysis; graphical processing unit (GPU) computing; performance optimization; GLOBAL OPTIMIZATION; SEARCH; IMPLEMENTATION; MODELS;
DOI
10.1109/TEVC.2022.3210654
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed phenomenal growth in the application and capabilities of graphical processing units (GPUs), owing to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU program (kernel) is challenging, and generally only certain specific kernel configurations lead to significant increases in performance. Auto-tuning is the process of automatically optimizing software for highly efficient execution on a target hardware platform. Auto-tuning is particularly useful for GPU programming, as a single kernel requires retuning after code changes, for different input data, and for different architectures. However, the discrete and nonconvex nature of the search space creates a challenging optimization problem. In this work, we investigate which algorithm produces the fastest kernels as the time budget for the tuning task is varied. We conduct a survey by performing experiments on 26 different kernel spaces, from nine different GPUs, for 16 different evolutionary black-box optimization algorithms. We then analyze these results and introduce a novel metric based on the PageRank centrality concept as a tool for gaining insight into the difficulty of the optimization problem. We demonstrate that our metric correlates strongly with the observed tuning performance.
Pages: 550-564
Number of pages: 15