Coreset: Hierarchical neuromorphic computing supporting large-scale neural networks with improved resource efficiency

Cited by: 9
Authors
Yang, Liwei [1]
Zhang, Huaipeng [1]
Luo, Tao [1]
Qu, Chuping [1]
Aung, Myat Thu Linn [1]
Cui, Yingnan [1]
Zhou, Jun [1]
Wong, Ming Ming [2]
Pu, Junran [3]
Do, Anh Tuan [2]
Goh, Rick Siow Mong [1]
Wong, Weng Fai [4]
Affiliations
[1] ASTAR, Inst High Performance Comp, 1 Fusionopolis Way,16-16 Connexis, Singapore 138632, Singapore
[2] ASTAR, Inst Microelect, 2 Fusionopolis Way,08-02 Innovis Tower, Singapore 138634, Singapore
[3] Nanyang Technol Univ, Sch Elect & Elect Engn, 50 Nanyang Ave, Singapore 639798, Singapore
[4] Natl Univ Singapore, Dept Comp Sci, Comp 1,13 Comp Dr, Singapore 117417, Singapore
Keywords
Neuromorphic computing; Spiking neural networks; Constraints-compatible mapping; Resource-efficient compilation; Large-scale SNN;
DOI
10.1016/j.neucom.2021.12.021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Crossbar-based neuromorphic chips promise improved energy efficiency for spiking neural networks (SNNs), but suffer from limited fan-in/fan-out and inefficient resource mapping. In this paper, we propose a new hardware mechanism, called core set, that enables configurable combination of cores. Using this hierarchical method, our end-to-end CSM (which stands for the 'CoreSet Method') framework efficiently solves the fan-in/fan-out issues and significantly improves resource efficiency. Experimental results show that CSM can efficiently support complex network structures while significantly improving accuracy: it achieves up to 4.6% improvement over other neuromorphic chips (i.e., IBM TrueNorth and Intel Loihi) on the CIFAR-10, CIFAR-100 and SVHN datasets, matching the accuracies of state-of-the-art SNN models. In addition, compared with IBM TrueNorth, CSM achieves improvements of up to 18.5x, 6.04x and 3.33x in memory efficiency, core efficiency and extrapolated throughput, respectively, thus enabling support for large-scale modern networks (such as VGG). In fact, our method can find optimal core sizes for minimal silicon area. As a proof of concept, we have implemented an FPGA emulation of coreset-supported neuromorphic computing. It achieves up to 7,737x speed-up compared to software simulation, thus not only facilitating timely SNN structure exploration and verification, but also enabling earlier prototyping for better neuromorphic hardware performance investigation. (c) 2021 Elsevier B.V. All rights reserved.
Pages: 128 - 140
Page count: 13
Related papers
45 in total
  • [1] A Mixed-Signal Structured AdEx Neuron for Accelerated Neuromorphic Cores
    Aamir, Syed Ahmed
    Mueller, Paul
    Kiene, Gerd
    Kriener, Laura
    Stradmann, Yannik
    Gruebl, Andreas
    Schemmel, Johannes
    Meier, Karlheinz
    [J]. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, 2018, 12 (05) : 1027 - 1037
  • [2] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [3] Mapping Spiking Neural Networks to Neuromorphic Hardware
    Balaji, Adarsha
    Das, Anup
    Wu, Yuefeng
    Huynh, Khanh
    Dell'Anna, Francesco G.
    Indiveri, Giacomo
    Krichmar, Jeffrey L.
    Dutt, Nikil D.
    Schaafsma, Siebren
    Catthoor, Francky
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (01) : 76 - 86
  • [4] Bellec G., 2017, abs/1711.05136
  • [5] Brainchip Holding Ltd, AK NEUR PROC
  • [6] Bray T, 2017, The JavaScript Object Notation (JSON) Data Interchange Format
  • [7] Neuromorphic computing using non-volatile memory
    Burr, Geoffrey W.
    Shelby, Robert M.
    Sebastian, Abu
    Kim, Sangbum
    Kim, Seyoung
    Sidler, Severin
    Virwani, Kumar
    Ishii, Masatoshi
    Narayanan, Pritish
    Fumarola, Alessandro
    Sanches, Lucas L.
    Boybat, Irem
    Le Gallo, Manuel
    Moon, Kibong
    Woo, Jiyoo
    Hwang, Hyunsang
    Leblebici, Yusuf
    [J]. ADVANCES IN PHYSICS-X, 2017, 2 (01) : 89 - 124
  • [8] A fully integrated reprogrammable memristor-CMOS system for efficient multiply-accumulate operations
    Cai, Fuxi
    Correll, Justin M.
    Lee, Seung Hwan
    Lim, Yong
    Bothra, Vishishtha
    Zhang, Zhengya
    Flynn, Michael P.
    Lu, Wei D.
    [J]. NATURE ELECTRONICS, 2019, 2 (07) : 290 - 299
  • [9] Cheng HP, 2017, DES AUT TEST EUROPE, P139, DOI 10.23919/DATE.2017.7926972
  • [10] CASCADE: Connecting RRAMs to Extend Analog Dataflow In An End-To-End In-Memory Processing Paradigm
    Chou, Teyuh
    Tang, Wei
    Botimer, Jacob
    Zhang, Zhengya
    [J]. MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019 : 114 - 125