Compiler directed parallelization of loops in Scale for shared-memory multiprocessors

Cited by: 0
Authors
Johnson, GS [1]
Sethumadhavan, S
Affiliations
[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas, Texas Adv Comp Ctr, Austin, TX 78712 USA
Source
COMPUTATIONAL SCIENCE - ICCS 2003, PT III, PROCEEDINGS | 2003 / Vol. 2659
Keywords
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Effective utilization of symmetric shared-memory multiprocessors (SMPs) is predicated on the development of efficient parallel code. Unfortunately, efficient parallelism is not always easy for the programmer to identify. Worse, exploiting such parallelism may directly conflict with optimizations affecting per-processor utilization (e.g., loop reordering to improve data locality). Here, we present our experience with a loop-level parallel compiler optimization for SMPs proposed by McKinley [6]. The algorithm uses dependence analysis and a simple model of the target machine to transform nested loops. The goal of the approach is to promote efficient execution of parallel loops by exposing sources of large-grain parallel work while maintaining per-processor locality. We implement the optimization within the Scale compiler framework, and analyze the performance of multiprocessor code produced for three microbenchmarks.
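The dependence-analysis step mentioned in the abstract can be illustrated with a minimal sketch. This is not the algorithm from [6] or the Scale implementation; it is a textbook-style legality check on dependence distance vectors, and every function name here (`carried_level`, `parallel_loops`, `is_legal`, `legal_orders_with_parallel_outer`) is hypothetical. It assumes each dependence in a loop nest is summarized as a tuple of iteration distances, outermost loop first.

```python
from itertools import permutations

def carried_level(dist):
    """Return the index of the loop that carries this dependence
    (the first nonzero entry), or None if it is loop-independent."""
    for level, d in enumerate(dist):
        if d != 0:
            return level
    return None

def parallel_loops(distances, depth):
    """A loop that carries no dependence can run its iterations in parallel."""
    carried = {carried_level(d) for d in distances}
    return [level for level in range(depth) if level not in carried]

def is_legal(order, distances):
    """A loop permutation is legal if no permuted distance vector
    becomes lexicographically negative."""
    for d in distances:
        permuted = [d[i] for i in order]
        level = carried_level(permuted)
        if level is not None and permuted[level] < 0:
            return False
    return True

def legal_orders_with_parallel_outer(distances, depth):
    """Enumerate legal loop orders whose outermost loop is parallel."""
    result = []
    for order in permutations(range(depth)):
        if not is_legal(order, distances):
            continue
        permuted = [tuple(d[i] for i in order) for d in distances]
        if 0 in parallel_loops(permuted, depth):
            result.append(order)
    return result

# Hypothetical nest: for i: for j: a[i][j] = a[i-1][j] + 1
# The single dependence has distance (1, 0): carried by the i loop.
example = [(1, 0)]
print(parallel_loops(example, 2))                    # inner j loop is parallel
print(legal_orders_with_parallel_outer(example, 2))  # interchange moves j outward
```

In this example the only order with a parallel outer loop places `j` outermost, which for a row-major array would make the inner loop stride across rows. That is one instance of the tension between coarse-grain parallelism and per-processor locality that the abstract describes; a machine model would be needed to choose between the candidates, and none is sketched here.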
Pages: 946-955
Page count: 10
Related papers
50 in total
  • [21] Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors
    Aliaga, Jose I.
    Bollhoefer, Matthias
    Martin, Alberto F.
    Quintana-Orti, Enrique S.
    PARALLEL COMPUTING: ARCHITECTURES, ALGORITHMS AND APPLICATIONS, 2008, 15 : 287 - +
  • [22] Parallelization of NAS benchmarks for shared memory multiprocessors
    Waheed, A
    Yan, J
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING-THEORY METHODS AND APPLICATIONS, 1999, 15 (03) : 353 - 363
  • [23] A PARALLEL LINKED LIST FOR SHARED-MEMORY MULTIPROCESSORS
    TANG, PY
    YEW, PC
    ZHU, CQ
    PROCEEDINGS : THE THIRTEENTH ANNUAL INTERNATIONAL COMPUTER SOFTWARE & APPLICATIONS CONFERENCE, 1989, : 130 - 135
  • [24] Parallelization of NAS benchmarks for shared memory multiprocessors
    Waheed, A
    Yan, J
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1998, 1401 : 377 - 386
  • [25] SEQUENTIAL HARDWARE PREFETCHING IN SHARED-MEMORY MULTIPROCESSORS
    DAHLGREN, F
    DUBOIS, M
    STENSTROM, P
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1995, 6 (07) : 733 - 746
  • [26] CACHE INVALIDATION PATTERNS IN SHARED-MEMORY MULTIPROCESSORS
    GUPTA, A
    WEBER, WD
    IEEE TRANSACTIONS ON COMPUTERS, 1992, 41 (07) : 794 - 810
  • [27] Conservative circuit simulation on shared-memory multiprocessors
    Keller, J
    Rauber, T
    Rederlechner, B
    TENTH WORKSHOP ON PARALLEL AND DISTRIBUTED SIMULATION - PADS 96, PROCEEDINGS, 1996, : 126 - 134
  • [28] FILTERED BACK PROJECTION ON SHARED-MEMORY MULTIPROCESSORS
    ZAPATA, EL
    CARAZO, JM
    BENAVIDES, JI
    WALTHER, S
    PESKIN, R
    ULTRAMICROSCOPY, 1990, 34 (04) : 271 - 282
  • [29] SCALABLE CACHE COHERENCE FOR SHARED-MEMORY MULTIPROCESSORS
    THAPAR, M
    DELAGI, BA
    FLYNN, MJ
    LECTURE NOTES IN COMPUTER SCIENCE, 1992, 591 : 1 - 12
  • [30] Shared-memory multiprocessors: SW or HW support?
    Scott, S
    THIRD INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE - PROCEEDINGS, 1997, : 140 - 140