QC_SANE: Robust Control in DRL Using Quantile Critic With Spiking Actor and Normalized Ensemble

Cited by: 2

Authors:

Gupta, Surbhi [1]

Singal, Gaurav [2]

Garg, Deepak [1]

Jagannathan, Sarangapani [3]

Affiliations:

[1] Bennett Univ, Dept Comp Sci Engn, Greater Noida 201310, Uttar Pradesh, India

[2] Netaji Subhas Univ Technol, Dept Comp Sci Engn, New Delhi 110078, India

[3] Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65409 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023, Vol. 34, No. 9

Keywords:

Artificial neural networks; Neurons; Uncertainty; Task analysis; Robustness; Statistics; Sociology; Actor critic; deep reinforcement learning (DRL); ensemble; reinforcement learning (RL); robust control; spiking neural network (SNN);

DOI:

10.1109/TNNLS.2021.3129525

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently introduced discrete-time deep reinforcement learning (DRL) techniques have led to significant advances in online games, robotics, and other domains. Inspired by these developments, we propose an approach for continuous control problems, referred to as Quantile Critic with Spiking Actor and Normalized Ensemble (QC_SANE), which uses a quantile loss to train the critic and a spiking neural network (NN) to train an ensemble of actors. The NN performs internal normalization using a scaled exponential linear unit (SELU) activation function and ensures robustness. An empirical study on multijoint dynamics with contact (MuJoCo)-based environments shows better training and test results than the state-of-the-art approach, the population-coded spiking actor network (PopSAN).
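
The two building blocks named in the abstract, a quantile loss and the SELU activation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the plain pinball form of the quantile loss (the paper may use a different variant), uses the standard published SELU constants, and the function names and the midpoint choice of quantile levels are hypothetical, chosen only for illustration.

```python
import math

# Standard published SELU constants (not taken from this record).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to a single value."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def quantile_loss(target, prediction, tau):
    """Pinball loss for one quantile level tau in (0, 1)."""
    error = target - prediction
    # Under-prediction (error > 0) is weighted by tau,
    # over-prediction (error < 0) by 1 - tau.
    return max(tau * error, (tau - 1.0) * error)


def mean_quantile_loss(target, quantile_predictions):
    """Average pinball loss over N quantile estimates of one target."""
    n = len(quantile_predictions)
    # Hypothetical choice: evenly spaced midpoint levels (i + 0.5) / N.
    taus = [(i + 0.5) / n for i in range(n)]
    total = sum(
        quantile_loss(target, q, t)
        for q, t in zip(quantile_predictions, taus)
    )
    return total / n
```

In this sketch a critic would output several quantile estimates of the return and be trained to minimize `mean_quantile_loss` against a target value, while `selu` would replace the usual activation in the network layers; how QC_SANE combines these pieces is described in the paper itself.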


Pages: 6656-6662

Number of pages: 7

