Hardware-Software Co-design to Mitigate DRAM Refresh Overheads: A Case for Refresh-Aware Process Scheduling

Citations: 0
Authors
Kotra, Jagadish B. [1]
Shahidi, Narges [1]
Chishti, Zeshan A. [2]
Kandemir, Mahmut T. [1]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Intel Labs, Hillsboro, OR 97124 USA
Source
TWENTY-SECOND INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXII) | 2017
Funding
National Science Foundation (USA);
Keywords
DRAM refresh; Operating Systems; Task Scheduling; Hardware-software co-design;
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
DRAM cells need periodic refresh to maintain data integrity. With high-capacity DRAMs, DRAM refresh poses a significant performance bottleneck because the number of rows to be refreshed by each refresh command (and hence the refresh cycle time, tRFC) increases. Modern DRAMs perform refresh at the rank level, while LPDDRs used in mobile environments support refresh at the per-bank level. Rank-level refresh degrades performance significantly, since none of the banks in a rank can serve on-demand requests while the rank is being refreshed. Per-bank refresh alleviates some of this bottleneck, as the other banks in the rank remain available for on-demand requests. Typical DRAM retention time is on the order of tens of milliseconds, viz., 64 ms for environments operating at temperatures below 85 °C and 32 ms for environments operating above 85 °C. With systems moving towards increased consolidation (e.g., virtualized environments), DRAM refresh becomes a significant bottleneck, as it reduces the overall DRAM bandwidth available per task. In this work, we propose a hardware-software co-design to mitigate DRAM refresh overheads by exposing the hardware address mapping and the DRAM refresh schedule to the operating system (OS). In our co-design, we propose a novel per-bank refresh schedule in the hardware which augments memory partitioning in the OS. Supported by the novel per-bank refresh schedule and memory partitioning, we propose a refresh-aware process scheduling algorithm in the OS which schedules applications on cores such that none of the on-demand requests from the applications are stalled by refreshes. The evaluation of our proposed co-design using multi-programmed workloads from the SPEC CPU2006, STREAM and NAS suites shows significant performance improvements compared to the previously proposed hardware-only approaches.
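The scheduling idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the actual refresh schedule or scheduling policy, so everything below is an assumption made for illustration. In particular, the round-robin schedule in `bank_under_refresh`, the `Task` type, the `TASKS` list, and the first-fit selection in `pick_tasks` are all hypothetical. The sketch assumes that the OS has partitioned memory so that each task's pages live in a known set of banks, and that the hardware refreshes exactly one bank per scheduling quantum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    banks: frozenset  # banks holding this task's pages (assumed OS partitioning)


def bank_under_refresh(quantum: int, num_banks: int) -> int:
    """Hypothetical refresh schedule: quantum q refreshes bank q mod num_banks."""
    return quantum % num_banks


def pick_tasks(tasks, quantum, num_banks, num_cores):
    """Pick up to num_cores tasks that never touch the bank being refreshed."""
    busy = bank_under_refresh(quantum, num_banks)
    runnable = [t for t in tasks if busy not in t.banks]
    # First-fit choice; a real scheduler would also weigh fairness and priority.
    return runnable[:num_cores]


# Illustrative workload: four tasks, each confined to two of four banks.
TASKS = [
    Task("A", frozenset({0, 1})),
    Task("B", frozenset({2, 3})),
    Task("C", frozenset({1, 2})),
    Task("D", frozenset({0, 3})),
]

for q in range(4):
    chosen = [t.name for t in pick_tasks(TASKS, q, num_banks=4, num_cores=2)]
    print(f"quantum {q}: refreshing bank {bank_under_refresh(q, 4)}, run {chosen}")
```

Under these assumptions, every task selected in a quantum has all of its pages in banks that are not being refreshed, so its memory requests do not wait on a refresh. The sketch ignores fairness, which a practical policy would need to address.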
Pages: 723-736
Page count: 14