Thread Migration Prediction for Distributed Shared Caches

Cited by: 5
Authors
Shim, Keun Sup [1]
Lis, Mieszko [1]
Khan, Omer [2]
Devadas, Srinivas [1]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Univ Connecticut, Storrs, CT USA
Keywords
Parallel Architecture; Distributed Caches; Cache Coherence; Data Locality
DOI
10.1109/L-CA.2012.30
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Chip-multiprocessors (CMPs) have become the mainstream parallel architecture in recent years; for scalability reasons, designs with high core counts tend towards tiled CMPs with physically distributed shared caches. This naturally leads to a Non-Uniform Cache Access (NUCA) design, where on-chip access latencies depend on the physical distances between requesting cores and home cores where the data is cached. Improving data locality is thus key to performance, and several studies have addressed this problem using data replication and data migration. In this paper, we consider another mechanism, hardware-level thread migration. This approach, we argue, can better exploit shared data locality for NUCA designs by effectively replacing multiple round-trip remote cache accesses with a smaller number of migrations. High migration costs, however, make it crucial to use thread migrations judiciously; we therefore propose a novel, on-line prediction scheme which decides whether to perform a remote access (as in traditional NUCA designs) or to perform a thread migration at the instruction level. For a set of parallel benchmarks, our thread migration predictor improves the performance by 24% on average over the shared-NUCA design that only uses remote accesses.
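The abstract does not describe the predictor's internals, so the following is only an illustrative sketch of what a per-instruction decision between a remote access and a thread migration could look like. The class name `MigrationPredictor`, the `run_threshold` parameter, and the training rule (mark an instruction as migratory when it starts a long enough run of accesses to the same remote home core) are all assumptions made for this example, not the authors' design.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationPredictor:
    """Toy per-instruction predictor (hypothetical; not the paper's design)."""

    # Assumed tuning knob: a run of at least this many consecutive accesses
    # to one remote home core is treated as worth a migration.
    run_threshold: int = 3
    # Instruction addresses (PCs) currently predicted to benefit from migrating.
    migratory_pcs: set = field(default_factory=set)

    def decide(self, pc: int, home_core: int, current_core: int) -> str:
        """Return 'local', 'migrate', or 'remote_access' for one memory access."""
        if home_core == current_core:
            return "local"
        return "migrate" if pc in self.migratory_pcs else "remote_access"

    def train(self, first_pc: int, run_length: int) -> None:
        """Update once a run of accesses to a single remote home core has ended."""
        if run_length >= self.run_threshold:
            self.migratory_pcs.add(first_pc)
        else:
            self.migratory_pcs.discard(first_pc)


predictor = MigrationPredictor(run_threshold=3)
before = predictor.decide(pc=0x400, home_core=5, current_core=1)
predictor.train(first_pc=0x400, run_length=4)   # a long run was observed
after = predictor.decide(pc=0x400, home_core=5, current_core=1)
```

In this sketch the first decision falls back to a remote access because nothing has been learned yet; after one long run is observed, the same instruction is predicted to migrate. A real design would also have to account for migration cost and table capacity, which the abstract mentions only in general terms.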
Pages: 53-56
Number of pages: 4