Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators

Cited by: 1
Authors
Pal, Subhankar [1]
Venkataramani, Swagath [2]
Srinivasan, Viji [2]
Gopalakrishnan, Kailash [2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021) | 2021
Keywords
PERFORMANCE;
DOI
10.1109/ISPASS51385.2021.00046
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A prevalent challenge for Deep Learning (DL) accelerators is how they are programmed to sustain utilization without impacting end-user productivity. Little prior effort has been devoted to the effective management of their on-chip Scratch-Pad Memory (SPM) across the DL operations of a Deep Neural Network (DNN). This is especially critical due to trends in complex network topologies and the emergence of eager execution. This work demonstrates that there exists up to a 5.2x performance gap in DL inference to be bridged using SPM management, on a set of image, object and language networks. We propose OnSRAM, a novel SPM management framework integrated with a DL accelerator runtime. OnSRAM has two variants, viz. OnSRAM-Static, which works on static graphs to identify data structures that should be held on-chip based on their properties, and OnSRAM-Eager, which targets an eager execution model (no graph) and uses a speculative scheme to hold/discard data structures. On a prototypical DL accelerator, OnSRAM-Static and OnSRAM-Eager achieve reductions in inference latency (batch size of 1) of 1.02-4.8x and 1.02-3.1x, respectively, over a baseline with no SPM management.
Pages: 240-242
Page count: 3