Heterogeneous Interconnect for Low-Power Snoop-Based Chip Multiprocessors

Cited by: 0
Authors
Shahidi, Narges [1]
Shafiee, Ali [1]
Baniasadi, Amirali [2]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran 111559517, Iran
[2] Univ Victoria, Dept Elect & Comp Engn, Engn Off Wing, Victoria, BC, Canada
Keywords
Chip Multiprocessors; Snoop-Based Cache Coherency Protocols;
DOI
10.1166/jolpe.2012.1220
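Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject classification code
0808; 0809;
Abstract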
In this work we propose Helia, which uses heterogeneous interconnects in power-aware chip multiprocessors. Helia improves energy efficiency in snoop-based chip multiprocessors by eliminating unnecessary activity in both the interconnect and the caches. It does so by combining snoop filtering mechanisms with wire management techniques. Our optimizations rely on the observation that a high percentage of cache mismatches can be detected using a small but highly informative subset of the tag bits. Helia comes in two variations: source-based (S-Helia) and destination-based (D-Helia). S-Helia relies on a global snapshot of remote caches, collected in the snoop controller, to detect possible remote tag mismatches. D-Helia instead stores and monitors a low-resolution snapshot of each remote cache at the destination nodes. Power is reduced because (a) our wire management techniques permit slow transmission of a subset of the tag bits while tag mismatches are being detected, and (b) cache accesses are avoided for detected mismatches. Our evaluation shows that S-Helia reduces power in the interconnect (dynamic: 50%, static: 50% to 55%) and the cache tag array (dynamic: 32%, static: 50%) while improving average performance by up to 4.4%. D-Helia reduces power in the interconnect (dynamic: 23%, static: 28%) and the cache tag array (dynamic: 50%, static: 30%) while improving average performance by up to 3.5%.
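The observation about partial tag bits can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of the low-order bits as the "informative" subset, and the example tag values are all assumptions made for illustration. The sketch only shows why a mismatch on a subset of tag bits is enough to rule out a full-tag match, so the cache tag array need not be accessed for that snoop.

```python
def partial_tag(tag: int, bits: int) -> int:
    """Return the low `bits` bits of a tag.

    Assumption: the low-order bits stand in for the 'highly informative'
    subset; the abstract does not say which bits are actually used.
    """
    return tag & ((1 << bits) - 1)


def needs_full_lookup(snoop_tag: int, cached_tags: list[int], bits: int) -> bool:
    """True if some cached tag agrees with the snoop on the partial bits.

    False means a certain mismatch: if the partial bits differ from every
    cached tag, the full tags must differ as well, so the tag array
    access can be skipped.
    """
    wanted = partial_tag(snoop_tag, bits)
    return any(partial_tag(t, bits) == wanted for t in cached_tags)


def count_filtered(snoop_tags: list[int], cached_tags: list[int], bits: int) -> int:
    """Count snoops that can be dropped without a tag array access."""
    return sum(
        1 for s in snoop_tags if not needs_full_lookup(s, cached_tags, bits)
    )


# Hypothetical tags held by one remote cache and four incoming snoops.
cached = [0x1A3, 0x2B7, 0x0F0]
snoops = [0x1A3, 0x5A3, 0x111, 0x2B8]
filtered = count_filtered(snoops, cached, bits=8)
```

In this sketch a partial match can still be a false positive (`0x5A3` shares its low 8 bits with `0x1A3` but is a different tag), which only costs an ordinary lookup. A partial mismatch is never wrong, which is what makes the filter safe.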
Pages: 624-635
Number of pages: 12