Optimizing array-intensive applications for on-chip multiprocessors

Cited by: 7
Authors
Kadayif, I
Kandemir, M
Chen, GL
Ozturk, O
Karakoy, M
Sezer, U
Affiliations
[1] Penn State Univ, CSE Dept, University Pk, PA 16802 USA
[2] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2BZ, England
[3] Univ Wisconsin, ECE Dept, Madison, WI 53706 USA
Funding
National Science Foundation (USA);
Keywords
on-chip multiprocessor; constrained optimization; embedded systems; energy consumption; adaptive loop parallelization; integer linear programming;
DOI
10.1109/TPDS.2005.57
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
With energy consumption becoming one of the first-class optimization parameters in computer system design, compilation techniques that consider performance and energy simultaneously are expected to play a central role. In particular, compiling a given application code under performance and energy constraints is becoming an important problem. In this paper, we focus on an on-chip multiprocessor architecture and present a set of code optimization strategies. We first evaluate an adaptive loop parallelization strategy (i.e., a strategy that allows each loop nest to execute using a different number of processors if doing so is beneficial) and measure the potential energy savings when unused processors during execution of a nested loop are shut down (i.e., placed into a power-down or sleep state). Our results show that shutting down unused processors can lead to as much as 67 percent energy savings at the expense of up to 17 percent performance loss in a set of array-intensive applications. To eliminate this performance penalty, we also discuss and evaluate a processor preactivation strategy based on compile-time analysis of nested loops. Based on our experiments, we conclude that an adaptive loop parallelization strategy combined with idle-processor shutdown and preactivation can be very effective in reducing energy consumption without increasing execution time. We then generalize our strategy and present an application parallelization strategy based on integer linear programming (ILP). Given an array-intensive application, our optimization strategy determines the number of processors to be used in executing each loop nest based on the objective function and additional compilation constraints provided by the user/programmer. Our initial experience with this constraint-based optimization strategy shows that it is very successful in optimizing array-intensive applications on on-chip multiprocessors under multiple energy and performance constraints.
Pages: 396-411
Number of pages: 16
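
The abstract describes choosing a processor count for each loop nest under an objective function and user-supplied constraints. The sketch below illustrates that kind of selection problem only; it is not the paper's ILP formulation. It assumes a hypothetical table of (time, energy) estimates per loop nest and processor count, with invented numbers, and it uses exhaustive search over the 0-1 choices in place of an ILP solver. Processor shutdown and preactivation costs are not modeled. The names PROFILES and choose_processor_counts are made up for this illustration.

```python
from itertools import product

# Hypothetical per-nest profiles: processor count -> (time, energy).
# All numbers are invented for illustration; none come from the paper.
PROFILES = [
    {1: (100.0, 50.0), 2: (55.0, 58.0), 4: (32.0, 75.0)},
    {1: (40.0, 20.0), 2: (26.0, 27.0), 4: (22.0, 46.0)},
    {1: (80.0, 40.0), 2: (42.0, 45.0), 4: (23.0, 52.0)},
]


def choose_processor_counts(profiles, time_limit):
    """Pick one processor count per loop nest.

    Minimizes total energy subject to total time <= time_limit.
    Returns (choice, total_energy, total_time), or None if no
    assignment meets the time limit. Exhaustive search stands in
    for an ILP solver here, so it only suits small inputs.
    """
    best = None
    # Each candidate assigns exactly one processor count to each nest.
    for choice in product(*(sorted(p) for p in profiles)):
        total_time = sum(p[c][0] for p, c in zip(profiles, choice))
        total_energy = sum(p[c][1] for p, c in zip(profiles, choice))
        if total_time > time_limit:
            continue  # violates the performance constraint
        if best is None or total_energy < best[1]:
            best = (choice, total_energy, total_time)
    return best


if __name__ == "__main__":
    print(choose_processor_counts(PROFILES, 120.0))
```

With a time limit of 120 on these invented profiles, the search picks 2, 1, and 4 processors for the three nests. Swapping the roles of time and energy (minimize time under an energy budget) gives the other kind of constraint the abstract mentions.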