Gallatin: A General-Purpose GPU Memory Manager

Authors
McCoy, Hunter [1]
Pandey, Prashant [1]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
Source
PROCEEDINGS OF THE 29TH ACM SIGPLAN ANNUAL SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, PPOPP 2024 | 2024
Keywords
GPU; Memory allocation; Concurrent data structures; High performance computing;
DOI
10.1145/3627535.3638499
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Subject classification code
081202;
Abstract
Dynamic memory management is critical for efficiently porting modern data processing pipelines to GPUs. However, building a general-purpose dynamic memory manager on GPUs is challenging due to the massive parallelism and weak memory coherence. Existing state-of-the-art GPU memory managers, Ouroboros and Reg-Eff, employ traditional data structures such as arrays and linked lists to manage memory objects. They build specialized pipelines to achieve performance for a fixed set of allocation sizes and fall back to the CUDA allocator for allocating large sizes. In the process, they lose general-purpose usability and fail to support critical applications such as streaming graph processing. In this paper, we introduce Gallatin, a general-purpose and high-performance GPU memory manager. Gallatin uses the van Emde Boas (vEB) tree data structure to manage memory objects efficiently and supports allocations of any size. Furthermore, we develop a highly concurrent GPU implementation of the vEB tree which can be broadly used in other GPU applications. It supports constant-time insertions, deletions, and successor operations for a given memory size. In our evaluation, we compare Gallatin with state-of-the-art specialized allocator variants. Gallatin is up to 374x faster on single-sized allocations and up to 264x faster on mixed-size allocations than the next-best allocator. In scalability benchmarks, Gallatin is up to 254x faster than the next-best allocator as the number of threads increases. For the graph benchmarks, Gallatin is 1.5x faster than the state-of-the-art for bulk insertions, slightly faster for bulk deletions, and 3x faster than the next-best allocator for all graph expansion tests.
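To make the vEB-style operations mentioned in the abstract concrete, the following is a minimal single-threaded C++ sketch of a two-level bitmap over a fixed universe of 4096 keys. It is not Gallatin's implementation: the name TwoLevelBitmap, the 64 x 64 layout, and the "smallest key >= x" definition of successor are assumptions chosen for illustration, and a concurrent GPU version would presumably need atomic bit operations, which are omitted here. Because the depth is fixed for a fixed universe size, each operation touches a bounded number of 64-bit words.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-level, vEB-style bitmap over the universe [0, 4096).
// Illustrative only; not the data structure from the paper.
struct TwoLevelBitmap {
    static constexpr int kWord = 64;   // bits per 64-bit word
    std::uint64_t summary = 0;         // bit i is set iff leaf[i] is non-empty
    std::uint64_t leaf[kWord] = {};    // leaf[i] holds keys 64*i .. 64*i + 63

    // Index of the lowest set bit; v must be non-zero.
    // A portable loop; real code would likely use a hardware bit-scan instruction.
    static int lowest_bit(std::uint64_t v) {
        int i = 0;
        while ((v & 1ULL) == 0) { v >>= 1; ++i; }
        return i;
    }

    void insert(int x) {
        assert(x >= 0 && x < kWord * kWord);
        leaf[x / kWord] |= (1ULL << (x % kWord));
        summary |= (1ULL << (x / kWord));
    }

    void erase(int x) {
        assert(x >= 0 && x < kWord * kWord);
        int hi = x / kWord;
        leaf[hi] &= ~(1ULL << (x % kWord));
        if (leaf[hi] == 0) summary &= ~(1ULL << hi);  // leaf became empty
    }

    // Smallest stored key >= x, or -1 if there is none.
    int successor(int x) const {
        assert(x >= 0 && x < kWord * kWord);
        int hi = x / kWord, lo = x % kWord;
        // First look in x's own leaf, at positions lo and above.
        std::uint64_t in_leaf = leaf[hi] & (~0ULL << lo);
        if (in_leaf != 0) return hi * kWord + lowest_bit(in_leaf);
        // Otherwise use the summary word to jump to the next non-empty leaf.
        if (hi + 1 >= kWord) return -1;
        std::uint64_t later = summary & (~0ULL << (hi + 1));
        if (later == 0) return -1;
        int next = lowest_bit(later);
        return next * kWord + lowest_bit(leaf[next]);
    }
};
```

In an allocator setting, one could read a set bit as "this block is free", so that successor(x) finds the first free block at or after index x; that mapping is likewise an assumption for illustration rather than a description of the paper's design.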
Pages: 364-376
Page count: 13