Architecture of neural processing unit for deep neural networks

Cited by: 15
Authors
Lee, Kyuho J. [1]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Artificial Intelligence Grad Sch, Sch Elect & Comp Engn, Ulsan, South Korea
Source
HARDWARE ACCELERATOR SYSTEMS FOR ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING | 2021, Vol. 122
Funding
National Research Foundation of Singapore
Keywords
ACCELERATOR;
DOI
10.1016/bs.adcom.2020.11.001
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) have become a promising way to bring AI into our daily lives, from self-driving cars and smartphones to games and drones. In most cases, DNNs have been accelerated on servers equipped with numerous computing engines such as GPUs, but recent technological advances demand energy-efficient DNN acceleration as modern applications move down to mobile computing nodes. Neural Processing Unit (NPU) architectures dedicated to energy-efficient DNN acceleration have therefore become essential. Although the training phase of a DNN requires precise number representations, many researchers have shown that a smaller bit-precision is sufficient for inference at low power consumption. This has led hardware architects to investigate energy-efficient NPU architectures with diverse HW-SW co-optimization schemes for inference. This chapter reviews several design examples of the latest NPU architectures for DNNs, focusing mainly on inference engines. It also discusses new architectural research on neuromorphic computing and processing-in-memory architectures, and offers perspectives on future research directions.
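The reduced bit-precision inference the abstract refers to can be illustrated with a minimal sketch (not taken from the chapter): symmetric post-training quantization of a float32 weight tensor to int8 with a per-tensor scale, the kind of scheme many NPU inference engines build on. All function names here are illustrative.

```python
import numpy as np

def quantize_int8(w):
    """Map float32 weights to int8 using a symmetric per-tensor scale."""
    scale = np.max(np.abs(w)) / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, s = quantize_int8(w)

# Rounding error per weight is bounded by half a quantization step (s / 2),
# which is why int8 inference often preserves model accuracy.
err = np.max(np.abs(dequantize(q, s) - w))
print(q.dtype, err <= 0.5 * s)
```

Storing weights as int8 cuts memory traffic 4x versus float32 and lets the multiply-accumulate arrays in an NPU use much smaller, lower-power integer arithmetic units.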
Pages: 217-245 (29 pages)
Related papers (50 in total)
  • [41] Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?
    Nurvitadhi, Eriko
    Venkatesh, Ganesh
    Sim, Jaewoong
    Marr, Debbie
    Huang, Randy
    Ong, Jason Gee Hock
    Liew, Yeong Tat
    Srivatsan, Krishnan
    Moss, Duncan
    Subhaschandra, Suchit
    Boudoukh, Guy
    FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 5 - 14
  • [42] Deep Neural Networks Compiler for a Trace-Based Accelerator (Short WIP Paper)
    Chang, Andre Xian Ming
    Zaidy, Aliasger
    Burzawa, Lukasz
    Culurciello, Eugenio
    ACM SIGPLAN NOTICES, 2018, 53 (06) : 89 - 93
  • [43] An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks
    Raut, Gopal
    Karkun, Saurabh
    Vishvakarma, Santosh Kumar
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2023, 16 (03)
  • [44] Elastic Filter Prune in Deep Neural Networks Using Modified Weighted Hybrid Criterion
    Hu, Wei
    Han, Yi
    Liu, Fang
    Hu, Mingce
    Li, Xingyuan
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2024, 2024, 14884 : 16 - 27
  • [45] Two-Level Scheduling Algorithms for Deep Neural Network Inference in Vehicular Networks
    Wu, Yalan
    Wu, Jigang
    Yao, Mianyang
    Liu, Bosheng
    Chen, Long
    Lam, Siew Kei
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (09) : 9324 - 9343
  • [46] TRIM: A Design Space Exploration Model for Deep Neural Networks Inference and Training Accelerators
    Qi, Yangjie
    Zhang, Shuo
    Taha, Tarek M.
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (05) : 1648 - 1661
  • [47] APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators
    Achararit, Paniti
    Hanif, Muhammad Abdullah
    Putra, Rachmad Vidya Wicaksana
    Shafique, Muhammad
    Hara-Azumi, Yuko
    IEEE ACCESS, 2020, 8 : 165319 - 165334
  • [48] Low Complexity Reconfigurable-Scalable Architecture Design Methodology for Deep Neural Network Inference Accelerator
    Nimbekar, Anagha
    Vatti, Chandrasekhara Srinivas
    Dinesh, Y. V. Sai
    Singh, Sunidhi
    Gupta, Tarun
    Chandrapu, Ramesh Reddy
    Acharyya, Amit
    2022 IEEE 35TH INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (IEEE SOCC 2022), 2022, : 83 - 88
  • [49] A Throughput-Optimized Channel-Oriented Processing Element Array for Convolutional Neural Networks
    Chen, Yu-Xian
    Ruan, Shanq-Jang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (02) : 752 - 756
  • [50] Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs
    Ding, Caiwen
    Ren, Ao
    Yuan, Geng
    Ma, Xiaolong
    Li, Jiayu
    Liu, Ning
    Yuan, Bo
    Wang, Yanzhi
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018, : 353 - 358