COBAYN: Compiler Autotuning Framework Using Bayesian Networks

Cited by: 57
Authors:
Ashouri, Amir Hossein [1 ]
Mariani, Giovanni [2 ]
Palermo, Gianluca [1 ]
Park, Eunjung [3 ]
Cavazos, John [4 ]
Silvano, Cristina [1 ]
Affiliations:
[1] Politecnico di Milano, Milan, Italy
[2] IBM Corporation, North Castle, NY, USA
[3] Los Alamos National Laboratory, Los Alamos, NM, USA
[4] University of Delaware, Department of Computer Science, Newark, DE 19716, USA
Funding:
European Union Horizon 2020
Keywords:
Bayesian networks; statistical inference; design space exploration; optimization
DOI:
10.1145/2928270
CLC Number:
TP3 [Computing technology, computer technology]
Subject Classification Code:
0812
Abstract:
The variety of today's architectures forces programmers to spend a great deal of time porting and tuning application code across platforms. Compilers themselves need additional tuning, and this tuning is complex because the standard optimization levels, designed for the average case on a specific target architecture, often fail to deliver the best results. This article proposes COBAYN (COmpiler autotuning framework using BAYesian Networks), a machine-learning-based compiler autotuning methodology that speeds up application performance and reduces the cost of the compiler optimization phases. The framework characterizes applications dynamically through microarchitecture-independent features and uses Bayesian networks to infer promising compiler configurations. The article also presents an evaluation based on static-analysis and hybrid feature-collection approaches, and it compares Bayesian networks against several state-of-the-art machine-learning models. Experiments were carried out on an embedded ARM platform with the GCC compiler, using two benchmark suites comprising 39 applications. The compiler configurations selected by the model (less than 7% of the search space) yielded application speedups of up to 4.6x on Polybench (1.85x on average) and 3.1x on cBench (1.54x on average) with respect to the standard optimization levels. Moreover, compared with (i) random iterative compilation, (ii) machine-learning-based iterative compilation, and (iii) noniterative predictive modeling techniques, the proposed technique achieves average speedups of 1.2x, 1.37x, and 1.48x, respectively. Finally, in terms of exploration efficiency, the proposed method is 4x and 3x more efficient on cBench and Polybench, respectively, while matching the solution quality reached by random iterative compilation.
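The abstract describes a loop of the following shape: characterize an application, use a Bayesian network trained on earlier compilation runs to infer a distribution over compiler flags conditioned on those characteristics, then compile and measure only the sampled configurations. The Python sketch below illustrates that workflow under strong simplifying assumptions: the feature clusters, flag set, and training records are hypothetical, and the learned Bayesian network of the paper is replaced by a naive per-flag conditional model P(flag enabled | feature cluster) estimated from counts, so it conveys the flavor of the inference step rather than the authors' implementation.

```python
# Minimal, self-contained sketch of a COBAYN-style flag-selection loop.
# NOT the authors' implementation: features, flags, and training data are
# hypothetical, and the learned Bayesian network is replaced by a naive
# per-flag conditional model P(flag enabled | feature cluster) for brevity.
from collections import defaultdict
import random

FLAGS = ["-funroll-loops", "-ftree-vectorize",
         "-fomit-frame-pointer", "-finline-functions"]

# Hypothetical training runs: (feature_cluster, flags that performed well).
TRAINING = [
    ("loop-heavy",   {"-funroll-loops", "-ftree-vectorize"}),
    ("loop-heavy",   {"-funroll-loops", "-finline-functions"}),
    ("branch-heavy", {"-fomit-frame-pointer"}),
    ("branch-heavy", {"-fomit-frame-pointer", "-finline-functions"}),
]

def fit(training):
    """Estimate P(flag enabled | cluster) from counts, with Laplace smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for cluster, flags in training:
        totals[cluster] += 1
        for f in flags:
            counts[cluster][f] += 1
    return {c: {f: (counts[c][f] + 1) / (totals[c] + 2) for f in FLAGS}
            for c in totals}

def sample_configs(model, cluster, n=3, rng=random):
    """Draw n candidate flag configurations for an application in `cluster`."""
    probs = model[cluster]
    configs = []
    for _ in range(n):
        enabled = [f for f in FLAGS if rng.random() < probs[f]]
        configs.append("-O2 " + " ".join(enabled))
    return configs

if __name__ == "__main__":
    model = fit(TRAINING)
    # In the real framework the cluster/features come from dynamic,
    # microarchitecture-independent characterization of the new application.
    for cfg in sample_configs(model, "loop-heavy"):
        print("candidate:", cfg)  # each candidate would then be compiled and timed
```

Each sampled candidate would then be compiled and benchmarked; because only a handful of configurations is evaluated (under 7% of the search space in the paper's experiments), the exploration cost stays low compared with exhaustive or purely random iterative compilation.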
Pages: 25