Efficient Use of GPU Memory for Large-Scale Deep Learning Model Training

Cited by: 13
|
Authors
Choi, Hyeonseong [1 ]
Lee, Jaehwan [1 ]
Affiliations
[1] Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang Si 10540, South Korea
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Iss. 21
Funding
National Research Foundation of Singapore;
Keywords
deep learning; large-scale model; CUDA Unified Memory; PyTorch;
DOI
10.3390/app112110377
CLC Number
O6 [Chemistry];
Discipline Code
0703;
Abstract
To achieve high accuracy in deep learning, it is often necessary to train a large-scale model. However, due to the limited capacity of GPU memory, it is difficult to train such models on a single GPU. NVIDIA introduced CUDA Unified Memory in CUDA 6 to overcome this limitation by virtually combining GPU memory and CPU memory, and CUDA 8 added memory advise options for using Unified Memory more efficiently. In this work, we propose a newly optimized scheme based on CUDA Unified Memory that uses GPU memory efficiently by applying a different memory advise to each data type according to its access pattern during deep learning training. We integrate CUDA Unified Memory into PyTorch to evaluate the performance of large-scale models on the expanded GPU memory, and we conduct comprehensive experiments on how to utilize Unified Memory efficiently by applying memory advises during training. As a result, when the data used for deep learning are divided into three types and a memory advise is applied to each type according to its access pattern, the deep learning execution time is reduced by 9.4% compared to default Unified Memory.
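The mechanism the abstract describes can be sketched with the CUDA runtime API: allocate buffers in Unified Memory with `cudaMallocManaged`, then attach a different `cudaMemAdvise` hint to each buffer. The division into weights, inputs, and activations and the particular advice chosen for each are illustrative assumptions here; the abstract does not specify the authors' exact mapping. Requires an NVIDIA GPU and the CUDA toolkit to compile (`nvcc`).

```cpp
// Hypothetical sketch: per-data-type cudaMemAdvise hints on Unified Memory.
// The data-type-to-advice mapping below is an assumption, not the paper's
// actual implementation.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int device = 0;
    cudaSetDevice(device);

    const size_t n = 1 << 20;
    float *weights, *inputs, *activations;

    // All three buffers live in Unified Memory (a single CPU+GPU address space).
    cudaMallocManaged(&weights,     n * sizeof(float));
    cudaMallocManaged(&inputs,      n * sizeof(float));
    cudaMallocManaged(&activations, n * sizeof(float));

    // Weights are read and updated on the GPU every iteration:
    // prefer to keep their pages resident on the device.
    cudaMemAdvise(weights, n * sizeof(float),
                  cudaMemAdviseSetPreferredLocation, device);

    // Input batches are written once by the CPU, then only read by the GPU:
    // read-mostly allows read-only page replication instead of migration.
    cudaMemAdvise(inputs, n * sizeof(float),
                  cudaMemAdviseSetReadMostly, device);

    // Activations are touched by the GPU; declaring the accessing device
    // keeps a mapping established and can reduce page-fault overhead.
    cudaMemAdvise(activations, n * sizeof(float),
                  cudaMemAdviseSetAccessedBy, device);

    // ... training iterations would run here ...

    cudaFree(weights);
    cudaFree(inputs);
    cudaFree(activations);
    return 0;
}
```

Because the driver migrates Unified Memory pages on demand, a mismatched hint costs performance rather than correctness, which is what makes per-access-pattern tuning like this practical to experiment with.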
Pages: 17
Related Papers (50 in total)
  • [41] Almutairi, S.; Abohashrh, M.; Razzaq, H. H.; Zulqarnain, M.; Namoun, A.; Khan, F. A Hybrid Deep Learning Model for Predicting Depression Symptoms From Large-Scale Textual Dataset. IEEE ACCESS, 2024, 12: 168477-168499.
  • [42] Tamizharasan, P. S.; Ramasubramanian, N. Analysis of large deviations behavior of multi-GPU memory access in deep learning. JOURNAL OF SUPERCOMPUTING, 2018, 74 (05): 2199-2212.
  • [43] Zhang, C.; Piccini, D.; Demirel, O. B.; Bonanno, G.; Roy, C. W.; Yaman, B.; Moeller, S.; Shenoy, C.; Stuber, M.; Akcakaya, M. Large-scale 3D non-Cartesian coronary MRI reconstruction using distributed memory-efficient physics-guided deep learning with limited training data. MAGNETIC RESONANCE MATERIALS IN PHYSICS BIOLOGY AND MEDICINE, 2024, 37 (03): 429-438.
  • [44] Zhu, Y.; Zhang, Y.; Pan, Y. Large-scale Restricted Boltzmann Machines on Single GPU. 2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013.
  • [45] Wu, T.; Zhao, W.; Keefer, E.; Yang, Z. A Lightweight Deep Compressive Model for Large-Scale Spike Compression. 2018 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS): ADVANCED SYSTEMS FOR ENHANCING HUMAN HEALTH, 2018: 207-210.
  • [46] Liu, W.; Ren, H.; Pan, C.; Wang, J. Deep Learning Based Beam Training for Extremely Large-Scale Massive MIMO in Near-Field Domain. IEEE COMMUNICATIONS LETTERS, 2023, 27 (01): 170-174.
  • [47] Djenouri, Y.; Belhadi, A.; Srivastava, G.; Lin, J. C.-W. An Efficient and Accurate GPU-based Deep Learning Model for Multimedia Recommendation. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (02).
  • [48] Zhang, C.; Piccini, D.; Demirel, O. B.; Bonanno, G.; Yaman, B.; Stuber, M.; Moeller, S.; Akcakaya, M. Distributed Memory-Efficient Physics-Guided Deep Learning Reconstruction for Large-Scale 3D Non-Cartesian MRI. 2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022.
  • [49] Huang, Y.; Tang, Y.; Xiong, T. Large-Scale Face Image Retrieval Based on Hadoop and Deep Learning. 2020 17TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2020: 326-329.
  • [50] Du, W.; Li, A.; Zhou, P.; Xu, Z.; Wang, X.; Jiang, H.; Wu, D. Approximate to Be Great: Communication Efficient and Privacy-Preserving Large-Scale Distributed Deep Learning in Internet of Things. IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (12): 11678-11692.