Affine partitioning algorithm to maximize parallelism and minimize communication

被引:0
|
作者
Lim, Amy W. [1 ]
Cheong, Gerald I. [1 ]
Lam, Monica S. [1 ]
机构
[1] Stanford Univ, Stanford, CA, United States
关键词
Parallel algorithms - Program compilers;
D O I
暂无
中图分类号
学科分类号
摘要
An affine partitioning framework unifies many useful program transforms such as unimodular transformations (interchange, reversal, skewing), loop fusion, fission, scaling, reindexing, and statement reordering. This paper presents an algorithm, based on this unified framework, that maximizes parallelism while minimizing communication in programs with arbitrary loop nestings and affine data accesses. Our algorithm can find the optimal affine partition that maximizes the degree of parallelism with the minimum degree of synchronizations. In addition, it uses a greedy algorithm to minimize communication between loops heuristically by aligning the computation partitions for different loops, trading off excess degrees of parallelism, and choosing pipelined parallelism over doall parallelism if it can significantly reduce the communication. The algorithm is optimal in maximizing the degrees of parallelism that require (1) no communication, (2) near-neighbor communication and a constant number of synchronizations, and (3) near-neighbor communication and O(n) synchronizations where n is the number of iterations in a loop.
引用
收藏
页码:228 / 237
相关论文
共 50 条
  • [1] Maximize Parallelism Minimize Overhead for Nested Loops via Loop Striping
    Chun Xue
    Zili Shao
    Edwin H.-M. Sha
    The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, 2007, 47 : 153 - 167
  • [2] Maximize parallelism minimize overhead for nested loops via loop striping
    Xue, Chun
    Sha, Edwin H. -M.
    Shao, Zili
    JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2007, 47 (02): : 153 - 167
  • [3] A LOOP TRANSFORMATION THEORY AND AN ALGORITHM TO MAXIMIZE PARALLELISM
    WOLF, ME
    LAM, MS
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1991, 2 (04) : 452 - 471
  • [4] A code generation algorithm for affine partitioning framework
    Liao, SW
    Du, ZH
    Wu, GS
    Lueh, GY
    11th International Conference on Parallel and Distributed Systems Workshops, Vol II, Proceedings,, 2005, : 17 - 21
  • [5] An optimized three region partitioning technique to maximize parallelism of nested loops with non-uniform dependences
    Pean, DL
    Chen, C
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2001, 17 (03) : 463 - 489
  • [6] Maximize locality, minimize contention
    Sutter, Herb
    DR DOBBS JOURNAL, 2008, 33 (06): : 48 - +
  • [7] A parrallel algorithm for partitioning a point set to minimize the maximum of diameters
    Alsuwaiyel, MH
    INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-IV, PROCEEDINGS, 1998, : 698 - 701
  • [8] A parallel algorithm for partitioning a point set to minimize the maximum of diameters
    Alsuwaiyel, MH
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2001, 61 (05) : 662 - 666
  • [9] Tiling Stencil Computations to Maximize Parallelism
    Bandishti, Vinayaka
    Pananilath, Irshad
    Bondhugula, Uday
    2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012,
  • [10] A NOTE ON PARALLELISM IN AFFINE GEOMETRY
    SCHREIBER, P
    MATHEMATICAL LOGIC QUARTERLY, 1993, 39 (01) : 131 - 132