RHPM: Using Relative Hotness to Guide Page Migration for Hybrid Memory Systems

Cited by: 5

Authors:

Peng, Zhouxuan [1]

Feng, Dan [1]

Chen, Jianxi [1]

Hu, Jing [1]

Huang, Chuang [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Engn Res Ctr Data Storage Syst & Technol, Sch Comp Sci & Technol, Wuhan Natl Lab Optoelect,Key Lab Informat Storage, Wuhan 430074, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | 2023 / Vol. 42 / No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

Data tiering; heterogeneous memory systems; nonvolatile memory (NVM); page migration policy; MAIN MEMORY; DRAM; PERFORMANCE;

DOI:

10.1109/TCAD.2022.3231836

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Modern computing systems and data-intensive applications are eager for larger and faster memory. Building hybrid memory systems with different memory technologies has become a dominant trend to satisfy these demands. For hybrid memory systems, page migration schemes that dynamically migrate frequently accessed hot pages into faster memory are crucial for improving performance. However, existing migration schemes are either too aggressive, resulting in unnecessary extra traffic, or too conservative to adapt quickly to changes in access patterns. Besides, the extra latency introduced by querying metadata is often ignored or handled in an unscalable manner. In this article, we propose a relative hotness page migration (RHPM) strategy, which discovers hot pages within a set of pages by having them compete with each other rather than comparing them with a threshold. A migration is performed only when a new page wins the competition. To overlap the latency of metadata accesses, RHPM fetches metadata and data in parallel. In addition, it uses a small metadata buffer to speed up metadata access. Compared to the state-of-the-art scheme, RHPM requires only 1/512 of the on-chip capacity, significantly reducing on-chip hardware overhead. Evaluation of RHPM with simulations of 25 workloads shows that RHPM outperforms the state-of-the-art scheme by an average of 13.34% in performance and saves 44.19% on energy, demonstrating better resilience to changes in access patterns.
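
The idea of pages competing with each other, rather than being compared with a fixed threshold, can be illustrated with a minimal Python sketch. The abstract does not describe the counter design, the set organization, or the tie-breaking rule, so everything here is an assumption made for illustration: the class name `MigrationSet`, the single fast-memory slot per set, the plain access counters, and the "strictly hotter than the resident page" rule are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationSet:
    """Hypothetical group of pages competing for one fast-memory slot."""

    resident: int                                  # page currently in fast memory
    counters: dict = field(default_factory=dict)   # page id -> access count
    migrations: int = 0                            # number of migrations so far

    def access(self, page: int) -> bool:
        """Record one access; return True if it triggered a migration."""
        self.counters[page] = self.counters.get(page, 0) + 1
        if page == self.resident:
            return False
        # Relative comparison (assumed rule): the challenger wins only when
        # it is strictly hotter than the page that currently holds the slot.
        if self.counters[page] > self.counters.get(self.resident, 0):
            self.resident = page
            self.migrations += 1
            return True
        return False
```

In this sketch, a page in slow memory that is accessed only as often as the resident page never triggers a migration; the swap happens on the first access that makes it strictly hotter. A threshold-based policy would instead migrate any page whose counter crosses a fixed value, regardless of how hot the resident page is.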

Citation

Pages: 2514-2526

Number of pages: 13
