HIGH-PERFORMANCE SOFTWARE COHERENCE FOR CURRENT AND FUTURE ARCHITECTURES

Cited by: 2
Authors
KONTOTHANASSIS, LI
SCOTT, ML
Affiliation
[1] Department of Computer Science, University of Rochester, Rochester
Keywords
DOI
10.1006/jpdc.1995.1116
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Shared memory provides an attractive and intuitive programming model for large-scale parallel computing, but requires a coherence mechanism to allow caching for performance while ensuring that processors do not use stale data in their computation. Implementation options range from distributed shared memory emulations on networks of workstations to tightly coupled, fully cache-coherent distributed shared memory multiprocessors. Previous work indicates that performance varies dramatically from one end of this spectrum to the other. Hardware cache coherence is fast, but also costly and time-consuming to design and implement, while DSM systems provide acceptable performance on only a limited class of applications. We claim that an intermediate hardware option, memory-mapped network interfaces that support a global physical address space without cache coherence, can provide most of the performance benefits of fully cache-coherent hardware at a fraction of the cost. To support this claim we present a software coherence protocol that runs on this class of machines, and use simulation to conduct a performance study. We look at both programming and architectural issues in the context of software and hardware coherence protocols. Our results suggest that software coherence on NCC-NUMA machines is a more cost-effective approach to large-scale shared-memory multiprocessing than either pure distributed shared memory or hardware cache coherence. (C) 1995 Academic Press, Inc.
Pages: 179-195
Number of pages: 17