Early-Adaptor: An Adaptive Framework for Proactive UVM Memory Management

Cited by: 3

Authors:

Go, Seokjin [1]

Lee, Hyunwuk [1]

Kim, Junsung [1]

Lee, Jiwon [1]

Yoon, Myung Kuk [2]

Ro, Won Woo [1]

Affiliations:

[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul, South Korea

[2] Ewha Womans Univ, Dept Comp Sci & Engn, Seoul, South Korea

Source:

2023 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, ISPASS | 2023

Funding:

National Research Foundation of Singapore;

Keywords:

GPGPU; Unified Virtual Memory; prefetching; memory management;

DOI:

10.1109/ISPASS57527.2023.00032

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Unified Virtual Memory (UVM) relieves programmers of the burden of managing memory between the CPU and GPUs. However, UVM can degrade performance because of its on-demand page migration scheme, especially under memory oversubscription. In this research, we conduct various analyses on real hardware, an NVIDIA RTX 3090, with NVIDIA's open-source GPU driver to examine this performance degradation. Our analysis shows that the effectiveness of prefetching correlates strongly with the relative number of page faults on a group of contiguous pages, which NVIDIA refers to as a Virtual Address Block (VABlock) and which spans a 2MB virtual address range. In addition, the risk of page thrashing is determined by the total number of VABlocks that consistently generate page faults during kernel execution. Hence, the performance impact of the prefetch threshold varies across workloads. These observations indicate that an adaptive prefetching scheme can resolve the performance bottleneck of memory oversubscription. To this end, we propose the Early-Adaptor (EA) framework, which automatically controls prefetching aggressiveness based on the page fault history. At runtime, the EA framework monitors page fault patterns both per VABlock and at global scope. After analyzing page fault generation rates and the possibility of page thrashing, the EA framework dynamically controls prefetching aggressiveness by changing the prefetch threshold. The EA framework requires only minor changes to GPU drivers and no changes to the GPU hardware. Experiments on real hardware show that when GPU memory is oversubscribed, the EA framework achieves an average speedup of 1.74x over the conventional GPU prefetcher.
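The control loop described in the abstract (track page faults per VABlock and globally, then adjust the prefetch threshold) might be sketched roughly as follows. This is an illustrative toy model only, not the paper's actual policy or driver code: the class name `AdaptiveThreshold`, the 4 KB page size, the threshold values (1, 51, 100), the 0.25 density cut-off, and the convention that a lower threshold means more aggressive prefetching are all assumptions made for the example.

```python
VABLOCK_BYTES = 2 * 1024 * 1024   # 2 MB VABlock, as stated in the abstract
PAGE_BYTES = 4 * 1024             # assumed 4 KB page size
PAGES_PER_VABLOCK = VABLOCK_BYTES // PAGE_BYTES


class AdaptiveThreshold:
    """Hypothetical controller; the paper's real heuristics are not given here."""

    def __init__(self, gpu_mem_bytes, low=1, default=51, high=100):
        # Assumed convention: a lower threshold prefetches more aggressively.
        self.gpu_mem_bytes = gpu_mem_bytes
        self.low, self.default, self.high = low, default, high
        self.faulted_pages = {}   # VABlock index -> set of faulted page indices

    def record_fault(self, address):
        """Record one page fault at a virtual address (per-VABlock history)."""
        block = address // VABLOCK_BYTES
        page = (address % VABLOCK_BYTES) // PAGE_BYTES
        self.faulted_pages.setdefault(block, set()).add(page)

    def thrashing_risk(self):
        """Global scope: do the faulting VABlocks exceed GPU memory capacity?"""
        return len(self.faulted_pages) * VABLOCK_BYTES > self.gpu_mem_bytes

    def threshold_for(self, block):
        """Pick a prefetch threshold for one VABlock from the fault history."""
        if self.thrashing_risk():
            return self.high      # prefetch conservatively to limit evictions
        density = len(self.faulted_pages.get(block, ())) / PAGES_PER_VABLOCK
        if density >= 0.25:       # assumed cut-off for a "densely faulting" block
            return self.low       # prefetch aggressively
        return self.default


ctl = AdaptiveThreshold(gpu_mem_bytes=4 * VABLOCK_BYTES)
for p in range(200):              # 200 distinct faulting pages in VABlock 0
    ctl.record_fault(p * PAGE_BYTES)
dense_threshold = ctl.threshold_for(0)
```

In this toy model the threshold for a densely faulting VABlock drops while the set of faulting VABlocks still fits in GPU memory, and rises for every block once that set outgrows it.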


Pages: 248-258

Number of pages: 11
