Prefetch mechanism in compiler-assisted S-DSM system

Cited by: 0
Authors
Niwa, J [1]
Affiliation
[1] Univ Tokyo, Grad Sch Sci, Dept Astron, Bunkyo Ku, Tokyo 1130032, Japan
Source
2004 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS, PROCEEDINGS | 2004
Keywords
DOI
Not available
CLC number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Software Distributed Shared Memory (S-DSM) provides a shared address space at run time and supports a wide range of applications on parallel computer systems built from commodity hardware. S-DSM caches remote data in local memory to reduce remote-memory-access latency. This paper proposes methods for further reducing remote-memory-access latency in S-DSM by using an optimizing compiler that directly analyzes explicitly parallel shared-memory source programs. Specifically, it presents compilation techniques for issuing prefetches for remote-memory accesses and introduces a framework that enables the prefetch mechanism. I have implemented these compilation techniques in an optimizing compiler, the Remote Communication Optimizer (RCOP). I have also implemented lightweight run-time systems on a PC cluster connected by Gigabit Ethernet (1000BASE-T). Experimental results using the SPLASH-2 benchmark suite show that the prefetch technique is effective for applications with coarse-grained synchronization. To obtain high performance, it is necessary to choose an appropriate framework according to the characteristics of the applications and platforms.
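As a rough illustration of why issuing prefetches ahead of remote-memory accesses can hide latency, the toy model below compares a loop that fetches each remote page on demand with one that starts the fetch a few iterations early. It is only a sketch under stated assumptions: the class ToyDsmNode, the function run_loop, the latency and compute constants, and the unlimited-bandwidth network are all hypothetical and are not taken from RCOP or from the paper's run-time system.

```python
# Toy model only: every name and constant here is an illustrative assumption,
# not the interface of RCOP or of the run-time system described in the paper.

REMOTE_LATENCY = 10   # assumed time units to fetch one remote page
COMPUTE_PER_PAGE = 4  # assumed time units of local work per page


class ToyDsmNode:
    """Local cache of remote pages with a non-blocking prefetch."""

    def __init__(self):
        self.now = 0        # simulated clock
        self.ready_at = {}  # page -> time its data becomes available locally
        self.stall = 0      # total time spent waiting on remote data

    def prefetch(self, page):
        # Non-blocking: start the fetch (if not already started) and return.
        if page not in self.ready_at:
            self.ready_at[page] = self.now + REMOTE_LATENCY

    def read(self, page):
        # Demand access: start a fetch if none is in flight, then wait for it.
        self.prefetch(page)
        wait = max(0, self.ready_at[page] - self.now)
        self.stall += wait
        self.now += wait

    def compute(self):
        self.now += COMPUTE_PER_PAGE


def run_loop(num_pages, distance):
    """Sweep pages 0..num_pages-1; distance=0 disables prefetching.

    Returns (total_time, stall_time) on the simulated clock.
    """
    node = ToyDsmNode()
    for i in range(num_pages):
        # What a compiler-inserted prefetch might look like: request the page
        # needed `distance` iterations from now before working on page i.
        if distance > 0 and i + distance < num_pages:
            node.prefetch(i + distance)
        node.read(i)
        node.compute()
    return node.now, node.stall
```

With these assumed constants, run_loop(8, 0) stalls on every page, whereas run_loop(8, 3) stalls only on the first three pages, which were never prefetched. The model ignores synchronization, coherence traffic, and network contention, all of which affect whether prefetching pays off in a real S-DSM.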
Pages: 520-529
Page count: 10