Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators

Cited by: 1
Authors
Pal, Subhankar [1]
Venkataramani, Swagath [2]
Srinivasan, Viji [2]
Gopalakrishnan, Kailash [2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021) | 2021
Keywords
PERFORMANCE;
DOI
10.1109/ISPASS51385.2021.00046
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A prevalent challenge for Deep Learning (DL) accelerators is how they are programmed to sustain utilization without impacting end-user productivity. Little prior effort has been devoted to the effective management of their on-chip Scratch-Pad Memory (SPM) across the DL operations of a Deep Neural Network (DNN). This is especially critical due to trends in complex network topologies and the emergence of eager execution. This work demonstrates that there exists up to a 5.2x performance gap in DL inference to be bridged using SPM management, on a set of image, object and language networks. We propose OnSRAM, a novel SPM management framework integrated with a DL accelerator runtime. OnSRAM has two variants, viz. OnSRAM-Static, which works on static graphs to identify data structures that should be held on-chip based on their properties, and OnSRAM-Eager, which targets an eager execution model (no graph) and uses a speculative scheme to hold/discard data structures. On a prototypical DL accelerator, OnSRAM-Static and OnSRAM-Eager achieve reductions in inference latency (batch size of 1) of 1.02-4.8x and 1.02-3.1x, respectively, over a baseline with no SPM management.
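The abstract says OnSRAM-Static picks data structures to hold on-chip "based on their properties" but does not say which properties or how they are weighed. The sketch below is therefore not the authors' algorithm. It is a minimal greedy illustration of the general idea on a static schedule, and every name in it (`Tensor`, `choose_pinned`), the chosen properties (size, lifetime, read count), and the scoring rule are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor:
    """Hypothetical description of one intermediate tensor in a static graph."""
    name: str
    size_kb: int       # on-chip footprint if held in the SPM
    produced_at: int   # schedule index of the producing operation
    last_use: int      # schedule index of the last consuming operation
    num_reads: int     # number of times consumers read the tensor


def choose_pinned(tensors, capacity_kb):
    """Greedily choose tensors to keep on-chip without exceeding SPM capacity.

    Assumed heuristic: rank tensors by off-chip transfers avoided
    (one write plus num_reads reads) per schedule step of residency.
    """
    if not tensors:
        return []
    steps = max(t.last_use for t in tensors) + 1
    used = [0] * steps  # SPM occupancy (KB) at each schedule step
    pinned = []

    def score(t):
        lifetime = t.last_use - t.produced_at + 1
        return (1 + t.num_reads) / lifetime

    # sorted() is stable, so equally scored tensors keep their input order.
    for t in sorted(tensors, key=score, reverse=True):
        live = range(t.produced_at, t.last_use + 1)
        if all(used[s] + t.size_kb <= capacity_kb for s in live):
            for s in live:
                used[s] += t.size_kb
            pinned.append(t.name)
    return pinned


# Made-up example: a short chain of layers with one long-lived skip connection.
example = [
    Tensor("conv1_out", 512, 0, 1, 1),
    Tensor("skip", 512, 1, 4, 1),
    Tensor("conv2_out", 256, 2, 3, 1),
    Tensor("conv3_out", 1024, 3, 4, 1),
]
pinned = choose_pinned(example, capacity_kb=1024)
print(pinned)
```

In this made-up example the 1024 KB tensor is left off-chip because it would exceed capacity while `conv2_out` is resident, whereas the long-lived skip tensor still fits. An eager-execution variant, as the abstract describes for OnSRAM-Eager, would have no `last_use` information up front and would need to speculate about it instead.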
Pages: 240-242
Page count: 3