Latency, occupancy, and bandwidth in DSM multiprocessors: A performance evaluation

Cited by: 10
Authors
Chaudhuri, M [1]
Heinrich, M
Holt, C
Singh, JP
Rothberg, E
Hennessy, J
Affiliations
[1] Cornell Univ, Comp Syst Lab, Ithaca, NY 14853 USA
[2] Transmeta Inc, Santa Clara, CA 95054 USA
[3] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[4] ILOG Inc, Mountain View, CA 94043 USA
[5] Stanford Univ, Comp Syst Lab, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation;
Keywords
occupancy; distributed shared memory multiprocessors; communication controller; latency; bandwidth; queuing model; flexible node controller;
DOI
10.1109/TC.2003.1214336
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
While the desire to use commodity parts in the communication architecture of a DSM multiprocessor offers advantages in cost and design time, the impact on application performance is unclear. We study this performance impact through detailed simulation, analytical modeling, and experiments on a flexible DSM prototype, using a range of parallel applications. We adapt the logP model to characterize the communication architectures of DSM machines. The l (network latency) and o (controller occupancy) parameters are the keys to performance in these machines, with the g (node-to-network bandwidth) parameter becoming important only for the fastest controllers. We show that, of all the logP parameters, controller occupancy has the greatest impact on application performance. Of the two contributions of occupancy to performance degradation (the latency it adds and the contention it induces), it is the contention component that governs performance regardless of network latency, showing a quadratic dependence on o. As expected, techniques to reduce the impact of latency make controller occupancy a greater bottleneck. Surprisingly, the performance impact of occupancy is substantial, even for highly tuned applications and even in the absence of latency-hiding techniques. Scaling the problem size is often used as a technique to overcome limitations in communication latency and bandwidth. Through experiments on a DSM prototype, we show that there are important classes of applications for which the performance lost by using higher-occupancy controllers cannot be regained easily, if at all, by scaling the problem size.
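The abstract's distinction between the latency that occupancy adds and the contention it induces can be illustrated with a textbook queue. The sketch below is not the paper's analytical model: it assumes, purely for illustration, that the controller behaves as a single M/D/1 server (Poisson arrivals, fixed service time equal to the occupancy o), for which the Pollaczek-Khinchine formula gives a mean queueing delay of λo² / (2(1 − λo)). The function names, parameter names, and units (cycles, messages per cycle) are hypothetical.

```python
def md1_wait(arrival_rate: float, occupancy: float) -> float:
    """Mean queueing delay at an M/D/1 server (Pollaczek-Khinchine).

    arrival_rate: messages per cycle reaching the controller (assumed).
    occupancy:    cycles the controller is busy per message (the 'o' term).
    """
    utilization = arrival_rate * occupancy
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return arrival_rate * occupancy ** 2 / (2 * (1 - utilization))


def per_message_cost(arrival_rate: float, occupancy: float,
                     network_latency: float) -> tuple[float, float]:
    """Split one message's cost into (latency term, contention term).

    The latency term grows linearly in occupancy; the contention term
    grows at least quadratically in occupancy under this toy model.
    """
    latency_term = network_latency + occupancy
    contention_term = md1_wait(arrival_rate, occupancy)
    return latency_term, contention_term
```

Under these assumptions, doubling the occupancy at a fixed arrival rate doubles its direct latency contribution but more than quadruples the queueing delay, which is the qualitative behavior the abstract attributes to the contention component.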
Pages: 862-880
Number of pages: 19
Related papers
34 in total
  • [1] AGARWAL A, 1995, P 22 INT S COMP ARCH, pI
  • [2] BILAS A, 1997, P 1997 SUP C HIGH PE
  • [3] Blumrich M. A., 1994, Proceedings of the 21st Annual International Symposium on Computer Architecture (Cat. No.94CH3397-7), P142, DOI 10.1109/ISCA.1994.288154
  • [4] CHAUDHURI M, 2002, LATENCY OCCUPANCY BAN
  • [5] The sensitivity of communication mechanisms to bandwidth and latency
    Chong, FT
    Barua, R
    Dahlgren, F
    Kubiatowicz, JD
    Agarwal, A
    [J]. 1998 FOURTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 1998, : 37 - 46
  • [6] CULLER D, 1993, P 4 ACM SIGPLAN S PR, P1
  • [7] DAI D, 1996, OSUCISRC496TR21 DEP
  • [8] Spider: A high-speed network interconnect
    Galles, M
    [J]. IEEE MICRO, 1997, 17 (01) : 34 - 39
  • [9] Gharachorloo K, 2000, ACM SIGPLAN NOTICES, V35, P13, DOI 10.1145/384264.378997
  • [10] GIBSON J, 2000, P 9 INT C ARCH SUPP, P49