HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks

Cited by: 2
Authors
Zhang, Zining [1 ,2 ]
He, Bingsheng [1 ]
Zhang, Zhenjie [3 ]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore, Singapore
[2] NUS, Ctr Trusted Internet & Community, Singapore, Singapore
[3] Neuron Mobil Pte Ltd, Singapore, Singapore
Source
51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022 | 2022
Keywords
neural network optimization; auto tuner; reinforcement learning;
DOI
10.1145/3545008.3545020
CLC classification
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
To perform neural network inference efficiently, the underlying tensor programs require substantial tuning before deployment to production environments. Typically, an enormous number of candidate tensor programs must be explored to find the best-performing one; this is necessary for neural network products to meet the demands of real-world applications such as natural language processing and autonomous driving. Auto-schedulers have been developed to remove the need for human intervention. However, due to the gigantic search space and the lack of intelligent search guidance, current auto-schedulers require hours to days of tuning to find the best-performing tensor program for an entire neural network. In this paper, we propose HARL, a reinforcement learning (RL) based auto-scheduler specifically designed for efficient tensor program exploration. HARL uses a hierarchical RL architecture in which learning-based decisions are made at every level of search granularity. It also adjusts exploration configurations automatically at run time for faster performance convergence. As a result, HARL improves tensor operator performance by 22% and search speed by 4.3x compared to the state-of-the-art auto-scheduler. Inference performance and search speed are also significantly improved on end-to-end neural networks.
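To illustrate the hierarchical idea the abstract describes (learning-based decisions at multiple levels of search granularity), below is a minimal, self-contained sketch of a two-level value-based search: a high-level choice selects a loop-tiling template and a low-level choice selects a tile size, with both levels updated from a measured reward. All class, function, and parameter names here are illustrative assumptions for exposition; this is not HARL's actual architecture or API, which uses full RL policies rather than the toy epsilon-greedy bandits shown.

```python
import random


class TwoLevelScheduler:
    """Toy two-level epsilon-greedy search over tensor-program candidates.

    A high-level decision picks a schedule template; a low-level decision
    picks a tile size for that template. In a real auto-scheduler the
    reward would be measured throughput on hardware; here it is supplied
    by the caller. Names and structure are illustrative only.
    """

    def __init__(self, templates, tile_sizes, epsilon=0.2, lr=0.5, seed=0):
        self.rng = random.Random(seed)
        self.templates = templates
        self.tile_sizes = tile_sizes
        self.epsilon = epsilon  # exploration rate at both levels
        self.lr = lr            # step size for value updates
        # Value estimates: high level over templates,
        # low level over (template, tile) pairs.
        self.q_hi = {t: 0.0 for t in templates}
        self.q_lo = {(t, s): 0.0 for t in templates for s in tile_sizes}

    def _pick(self, q, keys):
        # Epsilon-greedy: explore a random option, else take the best known.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(keys)
        return max(keys, key=lambda k: q[k])

    def propose(self):
        # Hierarchical decision: coarse choice first, then fine-grained.
        t = self._pick(self.q_hi, self.templates)
        _, s = self._pick(self.q_lo, [(t, x) for x in self.tile_sizes])
        return t, s

    def update(self, template, tile, reward):
        # Move both levels' estimates toward the observed reward.
        self.q_hi[template] += self.lr * (reward - self.q_hi[template])
        key = (template, tile)
        self.q_lo[key] += self.lr * (reward - self.q_lo[key])


# Demo with a synthetic reward where template "B" with tile size 32 is best.
sched = TwoLevelScheduler(["A", "B", "C"], [8, 16, 32], seed=1)
for _ in range(500):
    t, s = sched.propose()
    sched.update(t, s, (1.0 if t == "B" else 0.3) + (0.5 if s == 32 else 0.0))
best_template = max(sched.q_hi, key=sched.q_hi.get)
best_tile = max([8, 16, 32], key=lambda x: sched.q_lo[("B", x)])
```

The point of the hierarchy is that the coarse choice prunes the space before the fine-grained choice is made, so the search does not have to treat every (template, tile) combination as an independent arm.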
Pages: 13
References
29 in total
[1]  
Aarts EHL, 1987, SIMULATED ANNEALING, P7, DOI 10.1007/978-94-015-7744-1_2
[2]   Learning to Optimize Halide with Tree Search and Random Programs [J].
Adams, Andrew ;
Ma, Karima ;
Anderson, Luke ;
Baghdadi, Riyadh ;
Li, Tzu-Mao ;
Gharbi, Michael ;
Steiner, Benoit ;
Johnson, Steven ;
Fatahalian, Kayvon ;
Durand, Fredo ;
Ragan-Kelley, Jonathan .
ACM TRANSACTIONS ON GRAPHICS, 2019, 38 (04)
[3]  
[Anonymous], 2022, GEMM-Wolfram Language Documentation
[4]  
[Anonymous], 2022, Auto-scheduler in TVM v0.8.0
[5]  
[Anonymous], 2022, oneapi-src/oneDNN: oneAPI Deep Neural Network Library (oneDNN)
[6]  
[Anonymous], 2022, PPO-PyTorch
[7]  
Chen Ricky T. Q., 2018, Advances in Neural Information Processing Systems, V31
[8]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794
[9]  
Chetlur S, 2014, arXiv:1410.0759
[10]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223