moTuner: A Compiler-based Auto-tuning Approach for Mixed-precision Operators

Times Cited: 0
Authors
Mo, Zewei [1 ]
Lin, Zejia [2 ]
Zhang, Xianwei [1 ]
Lu, Yutong [1 ]
Affiliations
[1] Sun Yat-sen University, Guangzhou, People's Republic of China
[2] Northwestern Polytechnical University, Xi'an, People's Republic of China
Source
PROCEEDINGS OF THE 19TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS 2022 (CF 2022) | 2022
Funding
National Natural Science Foundation of China;
Keywords
mixed-precision operator; auto-tuning; compiler; performance and accuracy; GPUs;
DOI
10.1145/3528416.3530231
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
Arithmetic operators are now used in a wide spectrum of domains, including artificial intelligence, data analytics and scientific computing. Meanwhile, specialized hardware components that enable low-precision computing are increasingly deployed in GPUs and accelerators. While promising to boost performance, accelerating operators on such hardware requires manually tuning the mixed-precision knobs to balance performance and accuracy, which can be extremely challenging in practice. To address this issue, we present moTuner, an automatic framework for efficiently tuning mixed-precision operators. moTuner works at the compiler level to automatically enable mixed-precision computation, without requiring any manual modification of the source code or the operator library, thus significantly alleviating the programming burden. Being implemented in the compilation phase, moTuner is widely applicable and demands little library-specific effort. Further, moTuner adopts an optimized search strategy to effectively narrow down the configuration space during tuning. Evaluations on GEMM operators and real applications demonstrate that moTuner achieves performance improvements of up to 3.13x and 1.15x, respectively, while maintaining considerably high accuracy.
Pages: 94-102
Number of Pages: 9
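As a rough, hypothetical illustration of the trade-off the abstract describes: for each candidate precision "knob" of a GEMM operator, one can measure runtime and relative error against a high-precision reference, then keep the fastest configuration that stays within an accuracy bound. The NumPy sketch below is only a conceptual stand-in, not moTuner's implementation (which intercepts GPU library calls such as rocBLAS GEMMs at compile time); the function name, candidate set, and error bound are all made up for illustration.

```python
"""Minimal sketch of precision tuning for one GEMM (illustrative only)."""
import time
import numpy as np

def tune_gemm_precision(a, b, rel_err_bound=1e-3):
    """Return the fastest candidate precision whose relative error
    against an fp64 reference stays within rel_err_bound."""
    ref = a.astype(np.float64) @ b.astype(np.float64)  # accuracy reference
    best = None
    for dtype in (np.float16, np.float32):  # candidate precision "knobs"
        al, bl = a.astype(dtype), b.astype(dtype)
        t0 = time.perf_counter()
        out = al @ bl
        elapsed = time.perf_counter() - t0
        rel_err = float(np.linalg.norm(out - ref) / np.linalg.norm(ref))
        # Keep the fastest candidate that satisfies the accuracy bound.
        if rel_err <= rel_err_bound and (best is None or elapsed < best[1]):
            best = (np.dtype(dtype).name, elapsed, rel_err)
    return best  # e.g. ('float32', 0.004, 3e-07), or None if no knob fits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((512, 512), dtype=np.float32)
    b = rng.standard_normal((512, 512), dtype=np.float32)
    print(tune_gemm_precision(a, b))
```

Even this toy loop shows why exhaustive tuning is costly: each knob requires running the operator and comparing against a reference, which is why the paper emphasizes a search strategy that narrows the configuration space.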