Compiler support for array distribution on NUMA shared memory multiprocessors

Cited by: 2
Authors
Abdelrahman, TS [1]
Wong, TN [1]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 1A4, Canada
Source
JOURNAL OF SUPERCOMPUTING | 1998 / Vol. 12 / Issue 04
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
data distribution; locality management; cache management; parallelizing compilers; NUMA multiprocessors;
DOI
10.1023/A:1008035807599
Chinese Library Classification
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Management of program data to improve data locality and reduce false sharing is critical for scaling performance on NUMA shared memory multiprocessors. We use HPF-like data decomposition directives to partition and place arrays in data-parallel applications on Hector, a shared-memory NUMA multiprocessor. We describe a compiler system for automating the partitioning and placement of arrays. The compiler exploits Hector's shared memory architecture to efficiently implement distributed arrays. Experimental results from a prototype implementation demonstrate the effectiveness of these techniques. They also demonstrate the magnitude of the performance improvement attainable when our compiler-based data management schemes are used instead of operating system data management policies; performance improves by up to a factor of 5.
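As a rough illustration of what HPF-style decomposition directives express, the sketch below computes which processor owns each element of a one-dimensional array under BLOCK and CYCLIC distributions. It is not taken from the paper: the function names `block_owner` and `cyclic_owner` are hypothetical, and the code only shows the standard index-to-processor mappings, not the compiler's actual implementation on Hector.

```python
def block_owner(i, n, p):
    """Owner of element i when an n-element array is split into
    contiguous blocks over p processors (HPF-style BLOCK)."""
    block = -(-n // p)  # ceil(n / p): the block size assumed here
    return i // block


def cyclic_owner(i, p):
    """Owner of element i when elements are dealt round-robin
    over p processors (HPF-style CYCLIC)."""
    return i % p


if __name__ == "__main__":
    n, p = 10, 4  # illustrative sizes, not from the paper
    print("BLOCK :", [block_owner(i, n, p) for i in range(n)])
    print("CYCLIC:", [cyclic_owner(i, p) for i in range(n)])
```

Under BLOCK, neighbouring elements share an owner, which tends to favour locality for stencil-like accesses; under CYCLIC, work is spread more evenly when the cost per element varies.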
Pages: 349 - 371
Page count: 23
Related papers
50 in total
  • [21] Memory conscious scheduling for cluster-based NUMA multiprocessors
    Koita, T
    Katayama, T
    Saisho, K
    Fukuda, A
    JOURNAL OF SUPERCOMPUTING, 2000, 16 (03): : 217 - 235
  • [22] Memory Conscious Scheduling for Cluster-based NUMA Multiprocessors
    Takahiro Koita
    Tetsuro Katayama
    Keizo Saisho
    Akira Fukuda
    The Journal of Supercomputing, 2000, 16 : 217 - 235
  • [23] AND OR PARALLELISM ON SHARED-MEMORY MULTIPROCESSORS
    GUPTA, G
    JAYARAMAN, B
    JOURNAL OF LOGIC PROGRAMMING, 1993, 17 (01): : 59 - 89
  • [24] SMALL SHARED-MEMORY MULTIPROCESSORS
    BASKETT, F
    HENNESSY, JL
    SCIENCE, 1986, 231 (4741) : 963 - 967
  • [25] Compiler optimization of implicit reductions for distributed memory multiprocessors
    Lu, B
    Mellor-Crummey, J
    FIRST MERGED INTERNATIONAL PARALLEL PROCESSING SYMPOSIUM & SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, 1998, : 42 - 51
  • [26] Boosting the performance of shared memory multiprocessors
    Chalmers Univ of Technology
    Computer, 7 (63-70):
  • [27] EFFICIENT SYNCHRONIZATION ON MULTIPROCESSORS WITH SHARED MEMORY
    KRUSKAL, CP
    RUDOLPH, L
    SNIR, M
    ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 1988, 10 (04): : 579 - 601
  • [28] Boosting the performance of shared memory multiprocessors
    Stenstrom, P
    Brorsson, M
    Dahlgren, F
    Grahn, H
    Dubois, M
    COMPUTER, 1997, 30 (07) : 63 - +
  • [29] SYNCHRONIZED EXECUTION ON SHARED MEMORY MULTIPROCESSORS
    FRANCIS, R
    MATHIESON, I
    PARALLEL COMPUTING, 1988, 8 (1-3) : 165 - 175
  • [30] Design alternatives for shared memory multiprocessors
    Carter, J
    Kuo, CC
    Kuramkote, R
    Swanson, M
    FIFTH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 1998, : 41 - 50