Toward Energy-Efficient Spike-Based Deep Reinforcement Learning With Temporal Coding

Cited by: 0
Authors
Zhang, Malu [1 ]
Wang, Shuai [1 ]
Wu, Jibin [2 ]
Wei, Wenjie [1 ]
Zhang, Dehao [1 ]
Zhou, Zijian [1 ]
Wang, Siying [1 ]
Zhang, Fan [1 ]
Yang, Yang [1 ]
Affiliations
[1] University of Electronic Science and Technology of China, Chengdu 610054, China
[2] The Hong Kong Polytechnic University, Hong Kong, China
Funding
National Natural Science Foundation of China;
Keywords
Computational modeling; Biological system modeling; Decision making; Memory management; Deep reinforcement learning; Energy efficiency; Encoding; Real-time systems; Timing; Computational complexity; Power
DOI
10.1109/MCI.2025.3541572
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (DRL) enables efficient interaction with complex environments by continuously optimizing strategies and endowing agents with autonomous learning abilities. However, traditional DRL methods often require large-scale neural networks and extensive computational resources, which limits their applicability in power-sensitive, resource-constrained edge environments such as mobile robots and drones. To overcome these limitations, we leverage the energy-efficient properties of brain-inspired spiking neural networks (SNNs) to develop a novel spike-based DRL framework, referred to as Spike-DRL. Unlike traditional SNN-based reinforcement learning methods, Spike-DRL adopts the energy-efficient time-to-first-spike (TTFS) encoding scheme, in which information is encoded in the precise timing of a single spike. TTFS encoding allows Spike-DRL to operate in a sparse, event-driven manner, significantly reducing energy consumption. In addition, to improve the deployability of Spike-DRL in resource-constrained environments, a lightweight strategy that quantizes synaptic weights into low-bit representations is introduced, significantly reducing memory usage and computational complexity. Extensive experiments show that Spike-DRL achieves competitive performance with higher energy efficiency and lower memory requirements. This work presents a biologically inspired model well suited for real-time decision-making and autonomous learning in power-sensitive, resource-limited edge environments.
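For intuition, the minimal Python sketch below illustrates the two ideas named in the abstract: TTFS encoding, where a value is represented by the timing of a single spike, and uniform low-bit quantization of synaptic weights. The function names, the linear value-to-time mapping, and the symmetric uniform quantizer are illustrative assumptions for this sketch, not the paper's exact formulation.

    import numpy as np

    def ttfs_encode(x, t_max=100.0):
        """Time-to-first-spike encoding: map each normalized input in
        [0, 1] to the firing time of a single spike. Larger values fire
        earlier, so information is carried entirely by spike timing
        (one spike per neuron), which makes the code sparse and
        event-driven."""
        x = np.clip(x, 0.0, 1.0)
        return t_max * (1.0 - x)  # x = 1 fires at t = 0; x = 0 at t_max

    def quantize_weights(w, num_bits=4):
        """Generic uniform low-bit quantization of synaptic weights
        (an assumed quantizer; the paper's exact scheme may differ).
        Maps float weights onto 2**num_bits symmetric integer levels,
        then rescales, so only num_bits per weight need be stored."""
        q_max = 2 ** (num_bits - 1) - 1           # e.g. 7 for 4 bits
        scale = np.max(np.abs(w)) / q_max + 1e-12
        w_int = np.round(w / scale).clip(-q_max, q_max)
        return w_int * scale                      # dequantized weights

    # Example: encode a toy observation and quantize a random weight matrix.
    obs = np.array([0.9, 0.2, 0.5])               # hypothetical normalized state
    print(ttfs_encode(obs))                       # spike times; earlier = stronger
    w = np.random.randn(4, 3).astype(np.float32)
    print(quantize_weights(w, num_bits=4))        # 4-bit weight approximation

Under this encoding, a downstream spiking layer can react to the earliest spikes and ignore late or absent ones, which is the source of the sparsity and energy savings the abstract describes; the quantizer independently shrinks the memory footprint of the learned weights.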
Pages: 45-57
Page count: 13