Near-Memory Address Translation

Cited by: 21
Authors
Picorel, Javier [1 ]
Jevdjic, Djordje [2 ]
Falsafi, Babak [1 ]
Affiliations
[1] Ecole Polytech Fed Lausanne, EcoCloud, Lausanne, Switzerland
[2] Microsoft Res, Redmond, WA USA
Source
2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT) | 2017
Funding
Swiss National Science Foundation;
Keywords
Virtual memory; address translation; near-memory processing; MMU; TLB; page table; DRAM; servers;
DOI
10.1109/PACT.2017.56
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Memory and logic integration on the same chip is becoming increasingly cost-effective, creating the opportunity to offload data-intensive functionality to processing units placed inside memory chips. The introduction of memory-side processing units (MPUs) into conventional systems faces virtual memory as the first big showstopper: without efficient hardware support for address translation, MPUs have highly limited applicability. Unfortunately, conventional translation mechanisms fall short of providing fast translations as contemporary memories exceed the reach of TLBs, making expensive page walks common. In this paper, we are the first to show that the historically important flexibility to map any virtual page to any page frame is unnecessary in today's servers. We find that while limiting the associativity of the virtual-to-physical mapping incurs no penalty, it can break the translate-then-fetch serialization if combined with careful data placement in the MPU's memory, allowing translation and data fetch to proceed independently and in parallel. We propose the Distributed Inverted Page Table (DIPTA), a near-memory structure in which the smallest memory partition keeps the translation information for its data share, ensuring that translation completes together with the data fetch. DIPTA completely eliminates the performance overhead of translation, achieving speedups of up to 3.81x and 2.13x over conventional translation using 4KB and 1GB pages, respectively.
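The key observation in the abstract — that limiting the associativity of the virtual-to-physical mapping lets data fetch start before translation finishes — can be illustrated with a minimal sketch. This is not the paper's implementation; the constants and function names (`PAGE_BITS`, `ASSOCIATIVITY`, `frame_set`, `candidate_frames`) are illustrative assumptions, showing only that under a set-associative mapping the candidate frame set is a pure function of the virtual address:

```python
# Hedged sketch of a set-associative virtual-to-physical mapping, in the
# spirit of the abstract. All names and parameters are illustrative, not
# taken from the paper.

PAGE_BITS = 12                       # 4 KB pages
ASSOCIATIVITY = 4                    # each virtual page may map to one of 4 frames
NUM_FRAMES = 1 << 20                 # e.g. 4 GB of physical memory in 4 KB frames
NUM_SETS = NUM_FRAMES // ASSOCIATIVITY

def frame_set(vaddr: int) -> int:
    """The set index is computed from the virtual address alone, without
    consulting any page table, so the memory partition holding the data
    is known immediately and the fetch can begin in parallel with
    translation."""
    vpn = vaddr >> PAGE_BITS
    return vpn % NUM_SETS

def candidate_frames(vaddr: int) -> list[int]:
    """With associativity A, translation only has to select among A
    frames, all in the same partition; a partition-local structure
    (DIPTA's inverted page table) resolves the final frame alongside the
    data fetch."""
    s = frame_set(vaddr)
    return [s * ASSOCIATIVITY + way for way in range(ASSOCIATIVITY)]
```

With full associativity, by contrast, the frame could be anywhere in physical memory, so the fetch must serialize behind the translation — the bottleneck the paper removes.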
Pages: 303-317
Page count: 15