LDD: High-Precision Training of Deep Spiking Neural Network Transformers Guided by an Artificial Neural Network

Cited by: 1
Authors
Liu, Yuqian [1 ,2 ]
Zhao, Chujie [1 ,2 ]
Jiang, Yizhou [1 ,2 ]
Fang, Ying [3 ,4 ]
Chen, Feng [1 ,2 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[2] LSBDPA Beijing Key Lab, Beijing 100084, Peoples R China
[3] Fujian Normal Univ, Coll Comp & Cyber Secur, Fuzhou 350117, Peoples R China
[4] Fujian Normal Univ, Digital Fujian Internet of Thing Lab Environm Moni, Fuzhou 350117, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
spiking neural networks (SNNs); Transformer; distillation; image classification;
DOI
10.3390/biomimetics9070413
Chinese Library Classification (CLC)
T [Industrial Technology];
Subject Classification Code
08;
Abstract
The rise of large-scale Transformers has led to challenges regarding computational cost and energy consumption. In this context, spiking neural networks (SNNs) offer a potential solution owing to their energy efficiency and processing speed. However, inaccurate surrogate gradients and feature-space quantization make it difficult to train deep SNN Transformers directly. To tackle these challenges, we propose a method, called LDD, that aligns ANN and SNN features across different abstraction levels of a Transformer network. LDD incorporates structured feature knowledge from ANNs to guide SNN training, preserving crucial information and compensating for inaccurate surrogate gradients through layer-wise distillation losses. The proposed approach outperforms existing methods on the CIFAR10 (96.1%), CIFAR100 (82.3%), and ImageNet (80.9%) datasets, and enables the training of the deepest SNN Transformer network on ImageNet.
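The abstract describes aligning ANN and SNN features layer by layer through distillation losses. Below is a minimal, hypothetical sketch of such a layer-wise feature-distillation term in PyTorch; the tensor shapes, the time-averaging of spike features, the learned linear projections, and the loss weight are illustrative assumptions and are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerwiseDistillationLoss(nn.Module):
    """Hypothetical layer-wise ANN-to-SNN feature distillation (sketch)."""

    def __init__(self, snn_dims, ann_dims, weight=1.0):
        super().__init__()
        # One learned projection per distilled layer; dimensions are assumed.
        self.projections = nn.ModuleList(
            nn.Linear(s, a) for s, a in zip(snn_dims, ann_dims)
        )
        self.weight = weight

    def forward(self, snn_feats, ann_feats):
        # snn_feats[i]: [T, B, N, C_snn] spiking features over T time steps
        # ann_feats[i]: [B, N, C_ann] teacher (ANN) features, gradient-free
        loss = torch.zeros((), device=ann_feats[0].device)
        for proj, s, a in zip(self.projections, snn_feats, ann_feats):
            s_rate = s.mean(dim=0)                    # rate-code: average over time
            loss = loss + F.mse_loss(proj(s_rate), a.detach())
        return self.weight * loss

# Hypothetical usage: add the distillation term to the ordinary task loss.
# total_loss = F.cross_entropy(logits, labels) + distill(snn_feats, ann_feats)
```

In a setup like this, the SNN student receives gradients both from the task loss and from the per-layer alignment terms, which is one way layer-wise distillation can offset noisy surrogate gradients as the abstract suggests.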
Pages: 15