Hardware-Software Co-design to Mitigate DRAM Refresh Overheads: A Case for Refresh-Aware Process Scheduling

Cited by: 0
Authors
Kotra, Jagadish B. [1]
Shahidi, Narges [1]
Chishti, Zeshan A. [2]
Kandemir, Mahmut T. [1]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Intel Labs, Hillsboro, OR 97124 USA
Funding
National Science Foundation (USA)
Keywords
DRAM refresh; Operating Systems; Task Scheduling; Hardware-software co-design
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
DRAM cells need periodic refresh to maintain data integrity. In high-capacity DRAMs, refresh becomes a significant performance bottleneck because the number of rows refreshed by each refresh command, and hence the refresh cycle time (tRFC), increases. Modern DRAMs perform refresh at the rank level, while the LPDDRs used in mobile environments support refresh at the per-bank level. Rank-level refresh degrades performance significantly because none of the banks in a rank can serve on-demand requests while the rank is being refreshed. Per-bank refresh alleviates some of this bottleneck because the other banks in the rank remain available for on-demand requests. Typical DRAM retention time is on the order of tens of milliseconds: 64 ms for environments operating at temperatures below 85 °C and 32 ms for environments operating above 85 °C. As systems move towards increased consolidation (e.g., virtualized environments), DRAM refresh becomes a significant bottleneck because it reduces the overall DRAM bandwidth available per task. In this work, we propose a hardware-software co-design that mitigates DRAM refresh overheads by exposing the hardware address mapping and the DRAM refresh schedule to the operating system (OS). In our co-design, we propose a novel per-bank refresh schedule in the hardware that augments memory partitioning in the OS. Supported by this per-bank refresh schedule and memory partitioning, we propose a refresh-aware process scheduling algorithm in the OS that schedules applications on cores such that none of the on-demand requests from the applications are stalled by refreshes. The evaluation of our proposed co-design using multi-programmed workloads from the SPEC CPU2006, STREAM and NAS suites shows significant performance improvements compared to previously proposed hardware-only approaches.
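The idea behind refresh-aware scheduling can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details are not given in the abstract. It assumes a hypothetical round-robin schedule in which the hardware refreshes exactly one bank per OS scheduling quantum, and that memory partitioning has confined each task's pages to a known set of banks. The names `Task`, `bank_under_refresh` and `pick_tasks` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    banks: frozenset  # banks holding this task's pages (assumed known via OS memory partitioning)


def bank_under_refresh(quantum: int, num_banks: int) -> int:
    """Assumed schedule: bank (quantum mod num_banks) is refreshed during this quantum."""
    return quantum % num_banks


def pick_tasks(tasks, quantum, num_banks, num_cores):
    """Return up to num_cores tasks whose banks do not include the bank being refreshed."""
    busy = bank_under_refresh(quantum, num_banks)
    runnable = [t for t in tasks if busy not in t.banks]
    return runnable[:num_cores]


# Hypothetical workload: four tasks, four banks, two cores.
tasks = [
    Task("A", frozenset({0, 1})),
    Task("B", frozenset({2, 3})),
    Task("C", frozenset({1, 2})),
    Task("D", frozenset({0, 3})),
]

for q in range(4):
    chosen = [t.name for t in pick_tasks(tasks, q, num_banks=4, num_cores=2)]
    print(f"quantum {q}: refreshing bank {bank_under_refresh(q, 4)}, running {chosen}")
```

Under these assumptions, every task selected in a quantum touches only banks that are not being refreshed, so its on-demand requests never wait behind a refresh. A real scheduler would also have to handle fairness and the case where fewer runnable tasks than cores avoid the busy bank.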
Pages: 723-736
Number of pages: 14