moTuner: A Compiler-based Auto-tuning Approach for Mixed-precision Operators

Times Cited: 0
Authors
Mo, Zewei [1 ]
Lin, Zejia [2 ]
Zhang, Xianwei [1 ]
Lu, Yutong [1 ]
Affiliations
[1] Sun Yat-sen University, Guangzhou, People's Republic of China
[2] Northwestern Polytechnical University, Xi'an, People's Republic of China
Source
PROCEEDINGS OF THE 19TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS 2022 (CF 2022) | 2022
Funding
National Natural Science Foundation of China;
Keywords
mixed-precision operator; auto-tuning; compiler; performance and accuracy; GPUs;
DOI
10.1145/3528416.3530231
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
Arithmetic operators are now used in a wide spectrum of domains, including artificial intelligence, data analytics and scientific computing. Meanwhile, specialized hardware components that enable low-precision computing are increasingly deployed in GPUs and accelerators. While promising to boost performance, accelerating operators on such hardware requires manually tuning the mixed-precision knobs to balance performance and accuracy, which can be extremely challenging in practice. To address this issue, we present moTuner, an automatic framework for efficiently tuning mixed-precision operators. moTuner works at the compiler level to automatically enable mixed-precision computation, without requiring any manual modification of the source code or the operator library, thus significantly alleviating the programming burden. Being implemented in the compilation phase, moTuner is widely applicable and demands little library-specific effort. Further, moTuner adopts an optimized search strategy to effectively narrow down the configuration space during tuning. Evaluations on GEMM operators and real applications demonstrate that moTuner achieves performance improvements of up to 3.13x and 1.15x, respectively, while maintaining considerably high accuracy.
Pages: 94-102
Number of Pages: 9
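As a rough, hypothetical illustration of the trade-off the abstract describes: for each candidate precision "knob" of a GEMM operator, one can measure runtime and relative error against a high-precision reference, then keep the fastest configuration that stays within an accuracy bound. The NumPy sketch below is only a conceptual stand-in, not moTuner's implementation (which intercepts GPU library calls such as rocBLAS GEMMs at compile time); the function name, candidate set, and error bound are all made up for illustration.

```python
"""Minimal sketch of precision tuning for one GEMM (illustrative only)."""
import time
import numpy as np

def tune_gemm_precision(a, b, rel_err_bound=1e-3):
    """Return the fastest candidate precision whose relative error
    against an fp64 reference stays within rel_err_bound."""
    ref = a.astype(np.float64) @ b.astype(np.float64)  # accuracy reference
    best = None
    for dtype in (np.float16, np.float32):  # candidate precision "knobs"
        al, bl = a.astype(dtype), b.astype(dtype)
        t0 = time.perf_counter()
        out = al @ bl
        elapsed = time.perf_counter() - t0
        rel_err = float(np.linalg.norm(out - ref) / np.linalg.norm(ref))
        # Keep the fastest candidate that satisfies the accuracy bound.
        if rel_err <= rel_err_bound and (best is None or elapsed < best[1]):
            best = (np.dtype(dtype).name, elapsed, rel_err)
    return best  # e.g. ('float32', 0.004, 3e-07), or None if no knob fits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((512, 512), dtype=np.float32)
    b = rng.standard_normal((512, 512), dtype=np.float32)
    print(tune_gemm_precision(a, b))
```

Even this toy loop shows why exhaustive tuning is costly: each knob requires running the operator and comparing against a reference, which is why the paper emphasizes a search strategy that narrows the configuration space.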