Learning to Optimize Halide with Tree Search and Random Programs

被引:153
作者
Adams, Andrew [1 ]
Ma, Karima [2 ]
Anderson, Luke [3 ]
Baghdad, Riyadh [3 ]
Li, Tzu-Mao [3 ]
Gharbi, Michael [4 ]
Steiner, Benoit [1 ]
Johnson, Steven [5 ]
Fatahalian, Kayvon [6 ]
Durand, Fredo [3 ]
Ragan-Kelley, Jonathan [2 ]
机构
[1] Facebook AI Res, Menlo Pk, CA 94025 USA
[2] Univ Calif Berkeley, Berkeley, CA USA
[3] MIT CSAIL, Cambridge, MA USA
[4] Adobe, San Francisco, CA USA
[5] Google, Mountain View, CA USA
[6] Stanford Univ, Stanford, CA 94305 USA
来源
ACM TRANSACTIONS ON GRAPHICS | 2019年 / 38卷 / 04期
关键词
optimizing compilers; Halide;
D O I
10.1145/3306346.3322967
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We present a new algorithm to automatically schedule Halide programs for high-performance image processing and deep learning. We significantly improve upon the performance of previous methods, which considered a limited subset of schedules. We define a parameterization of possible schedules much larger than prior methods and use a variant of beam search to search over it. The search optimizes runtime predicted by a cost model based on a combination of new derived features and machine learning. We train the cost model by generating and featurizing hundreds of thousands of random programs and schedules. We show that this approach operates effectively with or without autotuning. It produces schedules which are on average almost twice as fast as the existing Halide autoscheduler without autotuning, or more than twice as fast with, and is the first automatic scheduling algorithm to significantly outperform human experts on average.
引用
收藏
页数:12
相关论文
共 29 条
[1]  
Abadi Martin, 2016, arXiv
[2]  
[Anonymous], 2008, GCC SUMMIT
[3]  
[Anonymous], INT R MATH KERN LIB
[4]  
[Anonymous], 2018, CORR
[5]   OpenTuner: An Extensible Framework for Program Autotuning [J].
Ansel, Jason ;
Kamil, Shoaib ;
Veeramachaneni, Kalyan ;
Ragan-Kelley, Jonathan ;
Bosboom, Jeffrey ;
O'Reilly, Una-May ;
Amarasinghe, Saman .
PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14), 2014, :303-315
[6]   A Survey on Compiler Autotuning using Machine Learning [J].
Ashouri, Amir H. ;
Killian, William ;
Cavazos, John ;
Palermo, Gianluca ;
Silvano, Cristina .
ACM COMPUTING SURVEYS, 2019, 51 (05)
[7]   PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming [J].
Baghdadi, Riyadh ;
Beaugnon, Ulysse ;
Cohen, Albert ;
Grosser, Tobias ;
Kruse, Michael ;
Reddy, Chandan ;
Verdoolaege, Sven ;
Absar, Javed ;
van Haastregt, Sven ;
Kravets, Alexey ;
Lokhmotov, Anton ;
Betts, Adam ;
Donaldson, Alastair F. ;
Ketema, Jeroen ;
David, Robert ;
Hajiyev, Elnar .
2015 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION (PACT), 2015, :138-149
[8]   A Practical Automatic Polyhedral Parallelizer and Locality Optimizer [J].
Bondhugula, Uday ;
Hartono, Albert ;
Ramanujam, J. ;
Sadayappan, R. .
PLDI'08: PROCEEDINGS OF THE 2008 SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN & IMPLEMENTATION, 2008, :101-+
[9]   Bilateral Guided Upsampling [J].
Chen, Jiawen ;
Adams, Andrew ;
Wadhwa, Neal ;
Hasinoff, Samuel W. .
ACM TRANSACTIONS ON GRAPHICS, 2016, 35 (06)
[10]  
Cummins Chris, 2017, PAR ARCH COMP TECHN