Predicting inter-thread cache contention on a chip multi-processor architecture

Cited by: 168
Authors
Chandra, D [1]
Guo, F [1]
Kim, S [1]
Solihin, Y [1]
Affiliations
[1] N Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
DOI
10.1109/HPCA.2005.27
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper studies the impact of L2 cache sharing on threads that simultaneously share the cache, on a Chip Multi-Processor (CMP) architecture. Cache sharing impacts threads non-uniformly, where some threads may be slowed down significantly, while others are not. This may cause severe performance problems such as sub-optimal throughput, cache thrashing, and thread starvation for threads that fail to occupy sufficient cache space to make good progress. Unfortunately, there is no existing model that allows extensive investigation of the impact of cache sharing. To allow such a study, we propose three performance models that predict the impact of cache sharing on co-scheduled threads. The input to our models is the isolated L2 cache stack distance or circular sequence profile of each thread, which can be easily obtained on-line or off-line. The output of the models is the number of extra L2 cache misses for each thread due to cache sharing. The models differ by their complexity and prediction accuracy. We validate the models against a cycle-accurate simulation that implements a dual-core CMP architecture, on fourteen pairs of mostly SPEC benchmarks. The most accurate model, the Inductive Probability model, achieves an average error of only 3.9%. Finally, to demonstrate the usefulness and practicality of the model, a case study that details the relationship between an application's temporal reuse behavior and its cache sharing impact is presented.
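The abstract names the stack distance profile as a model input without defining it. The sketch below is not one of the paper's three models. It only illustrates, under the usual LRU definition, how such a profile could be collected for a single cache set: one counter per LRU stack position, plus a final counter for accesses that miss. The function names `stack_distance_profile` and `misses_with_ways`, and the toy trace, are assumptions made for illustration.

```python
def stack_distance_profile(addresses, assoc):
    """Collect a stack distance profile for one LRU-managed cache set.

    Returns a list of assoc + 1 counters: counters[i] for i < assoc counts
    hits at LRU stack position i (0 = most recently used), and
    counters[assoc] counts accesses that miss in an assoc-way set.
    Hypothetical helper, not taken from the paper.
    """
    stack = []                      # index 0 is the most recently used line
    counters = [0] * (assoc + 1)
    for addr in addresses:
        if addr in stack:
            depth = stack.index(addr)
            stack.pop(depth)
            if depth < assoc:
                counters[depth] += 1     # hit at this stack position
            else:
                counters[assoc] += 1     # reused, but too deep: a miss
        else:
            counters[assoc] += 1         # first touch: a cold miss
        stack.insert(0, addr)            # accessed line becomes most recent
    return counters


def misses_with_ways(counters, ways):
    """Misses the same trace would incur if only `ways` ways were available."""
    return sum(counters[ways:])


if __name__ == "__main__":
    trace = ["A", "B", "A", "C", "B", "A"]   # toy trace, all mapping to one set
    profile = stack_distance_profile(trace, assoc=2)
    print(profile)
    print(misses_with_ways(profile, 1))
```

A profile of this shape makes it possible to estimate how many additional misses a thread would suffer if a co-scheduled thread left it fewer effective ways, which is the kind of quantity the abstract says the models output.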
Pages: 340-351
Page count: 12