A Graph-Based Iterative Compiler Pass Selection and Phase Ordering Approach

被引:4
作者
Nobre, Ricardo [1 ,2 ]
Martins, Luiz G. A. [3 ]
Cardoso, Joao M. P. [1 ,2 ]
机构
[1] Univ Porto, Fac Engn, Oporto, Portugal
[2] INESC TEC, Oporto, Portugal
[3] Univ Fed Uberlandia, Fac Comp, Uberlandia, MG, Brazil
关键词
Algorithms; Measurement; Performance; Experimentation; Phase-ordering; design space exploration; compilers; OPTIMIZATION;
D O I
10.1145/2907950.2907959
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Nowadays compilers include tens or hundreds of optimization passes, which makes it difficult to find sequences of optimizations that achieve compiled code more optimized than the one obtained using typical compiler options such as -O2 and -O3. The problem involves both the selection of the compiler passes to use and their ordering in the compilation pipeline. The improvement achieved by the use of custom phase orders for each function can be significant, and thus important to satisfy strict requirements such as the ones present in high-performance embedded computing systems. In this paper we present a new and fast iterative approach to the phase selection and ordering challenges resulting in compiled code with higher performance than the one achieved with the standard optimization levels of the LLVM compiler. The obtained performance improvements are comparable with the ones achieved by other iterative approaches while requiring considerably less time and resources. Our approach is based on sampling over a graph representing transitions between compiler passes. We performed a number of experiments targeting the LEON3 microarchitecture using the Clang/LLVM 3.7 compiler, considering 140 LLVM passes and a set of 42 representative signal and image processing C functions. An exhaustive cross-validation shows our new exploration method is able to achieve a geometric mean performance speedup of 1.28x over the best individually selected -OX flag when considering 100,000 iterations; versus geometric mean speedups from 1.16x to 1.25x obtained with state-of-the-art iterative methods not using the graph. From the set of exploration methods tested, our new method is the only one consistently finding compiler sequences that result in performance improvements when considering 100 or less exploration iterations. Specifically, it achieved geometric mean speedups of 1.08x and 1.16x for 10 and 100 iterations, respectively.
引用
收藏
页码:21 / 30
页数:10
相关论文
共 23 条
[1]  
Agakov F, 2006, INT SYM CODE GENER, P295
[2]   Finding effective compilation sequences [J].
Almagor, L ;
Cooper, KD ;
Grosul, A ;
Harvey, TJ ;
Reeves, SW ;
Subramanian, D ;
Torczon, L ;
Waterman, T .
ACM SIGPLAN NOTICES, 2004, 39 (07) :231-239
[3]  
[Anonymous], 2008, TMS320C64x+ DSP little-endian library programmer's reference (rev. b)
[4]  
[Anonymous], 2012, INT J COMPUTER SCI I
[5]  
Ashouri AH, 2014, IEEE SYM EMBED SYST, P90, DOI 10.1109/ESTIMedia.2014.6962349
[6]  
Cardoso J.M. P., 2012, Proceedings of the 11th Annual International Conference on Aspect-oriented Software Development, AOSD '12, P179
[7]   Exploring the structure of the space of compilation sequences using randomized search algorithms [J].
Cooper, Keith D. ;
Grosul, Alexander ;
Harvey, Timothy J. ;
Reeves, Steve ;
Subramanian, Devika ;
Torczon, Linda ;
Waterman, Todd .
JOURNAL OF SUPERCOMPUTING, 2006, 36 (02) :135-151
[8]   Milepost GCC: Machine Learning Enabled Self-tuning Compiler [J].
Fursin, Grigori ;
Kashnikov, Yuriy ;
Memon, Abdul Wahid ;
Chamski, Zbigniew ;
Temam, Olivier ;
Namolaru, Mircea ;
Yom-Tov, Elad ;
Mendelson, Bilha ;
Zaks, Ayal ;
Courtois, Eric ;
Bodin, Francois ;
Barnard, Phil ;
Ashton, Elton ;
Bonilla, Edwin ;
Thomson, John ;
Williams, Christopher K. I. ;
O'Boyle, Michael .
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2011, 39 (03) :296-327
[9]  
Goldberg DE., 1989, GENETIC ALGORITHMS S, V1
[10]  
HUANG Q, 2013, IEEE 21 INT S FIELD, P89, DOI DOI 10.1109/FCCM.2013.50