Evaluation Method of Synchronization for Shared-Memory On-Chip Many-Core Processor

Cited by: 1
Authors
Song, Fenglong [1]
Liu, Zhiyong [1]
Fan, Dongrui [1]
Huang, He [1]
Yuan, Nan [1]
Yu, Lei [1]
Zhang, Junchao [1]
Affiliation
[1] Chinese Acad Sci, Key Lab Comp Syst & Architecture, Inst Comp Technol, Beijing, Peoples R China
Source
2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS, PROCEEDINGS | 2009
Keywords
many-core architecture; synchronization; hardware-supported; evaluation; micro-benchmark;
DOI
10.1109/ISPA.2009.6
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
On-chip many-core architecture is an emerging and promising computing platform. High-speed on-chip communication and abundant on-chip resources are two outstanding advantages of this architecture, and they provide an opportunity to implement efficient synchronization schemes. The practical execution efficiency of a synchronization scheme is critical to this platform. However, there has been little research on systematic methods for evaluating candidate synchronization schemes on on-chip many-core processors, or on the effect of dedicated hardware support in this context. We therefore focus on the evaluation method and criteria for synchronization schemes on this platform. First, we present several criteria suited to on-chip many-core architecture: the absolute overhead of a synchronization operation, the transfer time between different synchronization operations, the overhead caused by load imbalance, and the network congestion caused by synchronization operations. Second, we illustrate how to design microbenchmarks, each dedicated to evaluating one performance criterion. Finally, we implement these microbenchmarks and synchronization schemes on an on-chip many-core processor with a shared level-two cache and on a commercial AMD Opteron chip multiprocessor, and we analyze the effect of dedicated hardware support. The results show that most synchronization overhead is caused by load imbalance and by serialization at the synchronization point. They also show that dedicated hardware support can clearly improve the performance of synchronization schemes on on-chip many-core processors.
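As a rough illustration of a microbenchmark dedicated to a single criterion, the sketch below separates the time spent blocked at a barrier when all threads arrive together (a stand-in for absolute overhead) from the time spent when one thread arrives late (a stand-in for load-imbalance overhead). It is not the authors' benchmark: the function name `barrier_wait_times`, the use of Python threads in place of cores, and `time.sleep` as simulated work are all assumptions made for illustration.

```python
import threading
import time


def barrier_wait_times(n_threads, work_seconds, rounds=20):
    """Return each thread's average time blocked in barrier.wait().

    work_seconds[i] is the simulated work (a sleep) that thread i does
    before every barrier. Unequal entries model load imbalance.
    Hypothetical helper, not taken from the paper.
    """
    barrier = threading.Barrier(n_threads)
    waited = [0.0] * n_threads  # each thread writes only its own slot

    def worker(tid):
        for _ in range(rounds):
            if work_seconds[tid] > 0:
                time.sleep(work_seconds[tid])  # simulated computation
            start = time.perf_counter()
            barrier.wait()                     # synchronization point
            waited[tid] += time.perf_counter() - start

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [w / rounds for w in waited]


if __name__ == "__main__":
    # Balanced case: no work, so the wait time approximates barrier cost.
    balanced = barrier_wait_times(4, [0.0, 0.0, 0.0, 0.0])
    # Imbalanced case: thread 3 does 10 ms of extra work per round.
    imbalanced = barrier_wait_times(4, [0.0, 0.0, 0.0, 0.01])
    print("balanced   (s):", balanced)
    print("imbalanced (s):", imbalanced)
```

In the imbalanced run, the three fast threads should report waits close to the extra work of the slow thread, while the slow thread itself reports only the release cost. Keeping the two runs separate is what lets each one isolate a single criterion.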
Pages: 571-576
Page count: 6