HAPPY: Hybrid Address-based Page Policy in DRAMs

Cited by: 9

Authors:

Ghasempour, Mohsen ^{[1]}

Jaleel, Aamer ^{[2]}

Garside, Jim D. ^{[1]}

Lujan, Mikel ^{[1]}

Affiliations:

[1] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England

[2] NVIDIA, Westford, MA USA

Source:

MEMSYS 2016: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS | 2016

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Memory Systems; DRAM; Page Closure Policy;

DOI:

10.1145/2989081.2989101

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Memory controllers have used static page closure policies to decide whether a row should be left open (open-page policy) or closed immediately (close-page policy) after it has been accessed. The appropriate choice for a particular access can reduce the average memory latency. However, since application access patterns change at run time, static page policies cannot guarantee optimal execution time. Hybrid page policies have been investigated as a means of covering these dynamic scenarios and are now implemented in state-of-the-art processors. Hybrid page policies switch between open-page and close-page policies while the application is running, by monitoring the access pattern of row hits/conflicts and predicting future behavior. Unfortunately, as the size of DRAM memory increases, fine-grain tracking and analysis of memory access patterns becomes impractical. We propose a compact memory address-based encoding technique which can improve or maintain the performance of DRAM page closure predictors while reducing the hardware overhead in comparison with state-of-the-art techniques. As a case study, we integrate our technique, HAPPY, with a state-of-the-art Intel-adaptive monitor (e.g., part of the Intel Xeon X5650) and a traditional Hybrid page policy. We evaluate them across 70 memory-intensive workload mixes consisting of single-threaded and multi-threaded applications. The experimental results show that applying the HAPPY encoding to the Intel-adaptive page closure policy can reduce the hardware overhead by 5x for the evaluated 64 GB memory (up to 40x for a 512 GB memory) while maintaining the prediction accuracy.
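
To make the idea of a hybrid page policy concrete, the following is a minimal sketch of a generic per-bank predictor that monitors the row access pattern and switches between leaving the row open and closing it. It is not the HAPPY encoding and not the Intel-adaptive policy; the class name `BankPredictor`, the 2-bit saturating counter, and the threshold of 2 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BankPredictor:
    """Hypothetical per-bank predictor (illustrative, not from the paper)."""
    counter: int = 2                  # 2-bit saturating counter, range 0..3
    open_row: Optional[int] = None    # row currently held in the row buffer
    last_row: Optional[int] = None    # row touched by the previous access

    def access(self, row: int) -> str:
        # Classify this access against the current row-buffer state.
        if self.open_row is None:
            outcome = "miss"          # row was closed (bank precharged)
        elif self.open_row == row:
            outcome = "hit"           # open-page policy paid off
        else:
            outcome = "conflict"      # wrong row was left open

        # Train on observed row locality: consecutive accesses to the
        # same row favor open-page, different rows favor close-page.
        if self.last_row is not None:
            if row == self.last_row:
                self.counter = min(3, self.counter + 1)
            else:
                self.counter = max(0, self.counter - 1)
        self.last_row = row

        # Apply the predicted policy for the next access (assumed threshold 2).
        self.open_row = row if self.counter >= 2 else None
        return outcome


if __name__ == "__main__":
    bank = BankPredictor()
    for r in [5, 5, 7, 9, 11, 11, 11, 11]:
        print(r, bank.access(r), "counter =", bank.counter)
```

In this sketch, a run of accesses to different rows drives the counter down until the bank switches to closing rows, and a run of accesses to the same row drives it back up. Keeping one such predictor per bank (or per row, for finer granularity) is the kind of tracking state whose cost grows with memory size, which is the overhead the abstract says the address-based encoding aims to reduce.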

Citation:

Pages: 311-321

Page count: 11

