Architecture of neural processing unit for deep neural networks

Cited by: 15
Authors
Lee, Kyuho J. [1]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Artificial Intelligence Grad Sch, Sch Elect & Comp Engn, Ulsan, South Korea
Source
HARDWARE ACCELERATOR SYSTEMS FOR ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING | 2021, Vol. 122
Funding
National Research Foundation of Singapore
Keywords
ACCELERATOR;
DOI
10.1016/bs.adcom.2020.11.001
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) have become a promising way to bring AI into our daily lives, from self-driving cars and smartphones to games and drones. In most cases, DNNs have been accelerated on servers equipped with numerous computing engines such as GPUs, but recent technological advances demand energy-efficient DNN acceleration as modern applications move down to mobile computing nodes. Neural Processing Unit (NPU) architectures dedicated to energy-efficient DNN acceleration have therefore become essential. Although the training phase of a DNN requires precise number representations, many researchers have shown that a smaller bit-precision is sufficient for inference at low power consumption. This has led hardware architects to investigate energy-efficient NPU architectures with diverse HW-SW co-optimization schemes for inference. This chapter reviews several design examples of the latest NPU architectures for DNNs, focusing mainly on inference engines. It also discusses new architectural research on neuromorphic computing and processing-in-memory architectures, and offers perspectives on future research directions.
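The reduced bit-precision inference the abstract refers to can be illustrated with a minimal sketch (not taken from the chapter): symmetric post-training quantization of a float32 weight tensor to int8 with a per-tensor scale, the kind of scheme many NPU inference engines build on. All function names here are illustrative.

```python
import numpy as np

def quantize_int8(w):
    """Map float32 weights to int8 using a symmetric per-tensor scale."""
    scale = np.max(np.abs(w)) / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, s = quantize_int8(w)

# Rounding error per weight is bounded by half a quantization step (s / 2),
# which is why int8 inference often preserves model accuracy.
err = np.max(np.abs(dequantize(q, s) - w))
print(q.dtype, err <= 0.5 * s)
```

Storing weights as int8 cuts memory traffic 4x versus float32 and lets the multiply-accumulate arrays in an NPU use much smaller, lower-power integer arithmetic units.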
Pages: 217-245 (29 pages)
Related papers (50 in total)
  • [41] Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?
    Nurvitadhi, Eriko
    Venkatesh, Ganesh
    Sim, Jaewoong
    Marr, Debbie
    Huang, Randy
    Ong, Jason Gee Hock
    Liew, Yeong Tat
    Srivatsan, Krishnan
    Moss, Duncan
    Subhaschandra, Suchit
    Boudoukh, Guy
    FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 5 - 14
  • [42] Deep Neural Networks Compiler for a Trace-Based Accelerator (Short WIP Paper)
    Chang, Andre Xian Ming
    Zaidy, Aliasger
    Burzawa, Lukasz
    Culurciello, Eugenio
    ACM SIGPLAN NOTICES, 2018, 53 (06) : 89 - 93
  • [43] An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks
    Raut, Gopal
    Karkun, Saurabh
    Vishvakarma, Santosh Kumar
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2023, 16 (03)
  • [44] Elastic Filter Prune in Deep Neural Networks Using Modified Weighted Hybrid Criterion
    Hu, Wei
    Han, Yi
    Liu, Fang
    Hu, Mingce
    Li, Xingyuan
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2024, 2024, 14884 : 16 - 27
  • [45] Two-Level Scheduling Algorithms for Deep Neural Network Inference in Vehicular Networks
    Wu, Yalan
    Wu, Jigang
    Yao, Mianyang
    Liu, Bosheng
    Chen, Long
    Lam, Siew Kei
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (09) : 9324 - 9343
  • [46] TRIM: A Design Space Exploration Model for Deep Neural Networks Inference and Training Accelerators
    Qi, Yangjie
    Zhang, Shuo
    Taha, Tarek M.
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (05) : 1648 - 1661
  • [47] APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators
    Achararit, Paniti
    Hanif, Muhammad Abdullah
    Putra, Rachmad Vidya Wicaksana
    Shafique, Muhammad
    Hara-Azumi, Yuko
    IEEE ACCESS, 2020, 8 : 165319 - 165334
  • [48] Low Complexity Reconfigurable-Scalable Architecture Design Methodology for Deep Neural Network Inference Accelerator
    Nimbekar, Anagha
    Vatti, Chandrasekhara Srinivas
    Dinesh, Y. V. Sai
    Singh, Sunidhi
    Gupta, Tarun
    Chandrapu, Ramesh Reddy
    Acharyya, Amit
    2022 IEEE 35TH INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (IEEE SOCC 2022), 2022, : 83 - 88
  • [49] A Throughput-Optimized Channel-Oriented Processing Element Array for Convolutional Neural Networks
    Chen, Yu-Xian
    Ruan, Shanq-Jang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (02) : 752 - 756
  • [50] Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs
    Ding, Caiwen
    Ren, Ao
    Yuan, Geng
    Ma, Xiaolong
    Li, Jiayu
    Liu, Ning
    Yuan, Bo
    Wang, Yanzhi
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018, : 353 - 358