SOS: A Software-Oriented Distributed Shared Cache Management Approach for Chip Multiprocessors

Cited by: 9
Authors
Jin, Lei [1]
Cho, Sangyeun [1]
Affiliations
[1] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
Source
18TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS | 2009
Keywords
CMP; NUCA; OS; Page Coloring; Performance
DOI
10.1109/PACT.2009.14
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new software-oriented approach for managing the distributed shared L2 caches of a chip multiprocessor (CMP) for latency-oriented multithreaded applications. The conventional shared cache scheme loses performance due to the blind distribution of data predominantly accessed by a single thread. SOS, our software-oriented distributed shared cache management approach, infers a program's data affinity hints through a novel machine-learning-based analysis of its L2 cache access behavior. The OS utilizes the hints to guide proper data placement in the L2 cache with page coloring. The derived hints are independent of the program input and can be used for multiple runs. By off-loading the cache management task onto software, SOS deviates substantially from previously proposed hardware-based strategies and opens up a new opportunity for CMP cache optimization. Our experimental results demonstrate that SOS is very effective in reducing the number of remote cache accesses. By using the hints for guiding page coloring alone, SOS achieves an average speedup of 10% and up to 23% over the shared cache scheme. When hints are used to direct data replication, SOS secures an additional performance gain of 9%, performing 19% better than the shared cache scheme on average.
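The hint-guided page coloring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a page's home L2 slice is its physical page number modulo the slice count, and it uses per-core access counts as a hypothetical stand-in for the paper's learned affinity hints. All names (`page_color`, `ColoredPageAllocator`, `preferred_slice_from_hint`) and the constant `NUM_SLICES` are invented for illustration.

```python
from collections import defaultdict

NUM_SLICES = 16  # assumed number of L2 slices (one per tile)


def page_color(physical_page_number: int, num_slices: int = NUM_SLICES) -> int:
    """Home L2 slice of a page, assuming simple page-number interleaving."""
    return physical_page_number % num_slices


def preferred_slice_from_hint(access_counts: dict) -> int:
    """Pick the core that accesses the data most (hypothetical hint format)."""
    return max(access_counts, key=access_counts.get)


class ColoredPageAllocator:
    """Toy OS page allocator that keeps one free list per page color."""

    def __init__(self, num_pages: int, num_slices: int = NUM_SLICES):
        self.num_slices = num_slices
        self.free = defaultdict(list)
        for ppn in range(num_pages):
            self.free[page_color(ppn, num_slices)].append(ppn)

    def allocate(self, preferred_slice: int) -> int:
        """Return a free page of the preferred color, else the next color in order."""
        for offset in range(self.num_slices):
            color = (preferred_slice + offset) % self.num_slices
            if self.free[color]:
                return self.free[color].pop()
        raise MemoryError("no free physical pages")


# Usage: data mostly touched by core 3 lands in core 3's local L2 slice.
allocator = ColoredPageAllocator(num_pages=64)
hint = {0: 5, 3: 120, 7: 2}  # accesses per core (made-up numbers)
target = preferred_slice_from_hint(hint)
page = allocator.allocate(target)
print(target, page, page_color(page))
```

The fallback simply walks to the next color in numeric order. A real allocator would more likely prefer slices that are physically close to the requesting core.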
Pages: 361-371
Number of pages: 11