Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Cited: 0
Authors
Si, Yanna [1 ]
Pu, Jiexin [1 ]
Zang, Shaofei [1 ]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China
Source
ICCAIS 2019: THE 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES | 2019
Keywords
Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
To solve the control problem of nonlinear systems with continuous state spaces, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the curse of dimensionality that limits classical tabular reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent over an experience replay buffer updates the network parameters, which effectively reduces the number of iterations and increases the learning speed. Moreover, momentum optimization is introduced to further stabilize training and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
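The record carries only the abstract, not the paper's implementation. The sketch below is a rough PyTorch illustration (not the authors' code) assembling the pieces the abstract names: a feedforward Q-network, an epsilon-decreasing policy, momentum SGD, and a mini-batch residual-gradient update in which the Bellman target is deliberately not detached. The network width, learning rate, momentum, and decay constants are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Multi-layer feedforward network approximating Q(s, .) for all actions."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, s):
        return self.net(s)

state_dim, n_actions, gamma = 4, 2, 0.99   # CartPole-like sizes (assumed)
q = QNet(state_dim, n_actions)
# Momentum optimization, as the abstract describes; lr/momentum are assumptions.
opt = torch.optim.SGD(q.parameters(), lr=1e-3, momentum=0.9)

def epsilon(step, eps0=1.0, eps_min=0.05, decay=0.995):
    """Epsilon-decreasing schedule: exploration probability shrinks over time."""
    return max(eps_min, eps0 * decay ** step)

def select_action(state, step):
    """Epsilon-decreasing action selection over the current Q estimates."""
    if random.random() < epsilon(step):
        return random.randrange(n_actions)
    with torch.no_grad():
        return q(torch.as_tensor(state, dtype=torch.float32)).argmax().item()

def residual_gradient_step(s, a, r, s2, done):
    """One mini-batch update on the squared Bellman residual.

    The bootstrap target r + gamma * max_a' Q(s', a') is NOT detached,
    so the gradient flows through both Q(s, a) and the successor value --
    the residual gradient method, as opposed to the semi-gradient
    (DQN-style) update that treats the target as a constant.
    """
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    target = r + gamma * (1.0 - done) * q(s2).max(dim=1).values  # no .detach()
    loss = ((target - q_sa) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Synthetic transitions stand in for a replay-buffer mini-batch here; in the
# actual task they would be sampled from stored CartPole experience.
B = 32
loss = residual_gradient_step(
    torch.randn(B, state_dim),
    torch.randint(0, n_actions, (B,)),
    torch.randn(B),
    torch.randn(B, state_dim),
    torch.randint(0, 2, (B,)).float(),
)
print(f"Bellman residual loss: {loss:.4f}")
```

The one line separating this from the usual semi-gradient update is the absence of `.detach()` on the bootstrap target: the squared Bellman residual is minimized with respect to both of its terms. Residual gradients are generally better behaved under function approximation than semi-gradients, at the cost of requiring two independent next-state samples for unbiasedness in stochastic environments; CartPole's near-deterministic dynamics sidestep that issue.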
Pages: 5
Related Papers
50 records in total (entries [21]-[30] shown)
  • [21] Learning Hose Transport Control with Q-Learning
    Fernandez-Gauna, Borja
    Lopez-Guede, Jose Manuel
    Zulueta, Ekaitz
    Grana, Manuel
    NEURAL NETWORK WORLD, 2010, 20 (07) : 913 - 923
  • [22] Mobile robot navigation: neural Q-learning
    Parasuraman, S.
    Yun, Soh Chin
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2012, 44 (04) : 303 - 311
  • [23] Satisfaction based Q-learning for integrated lighting and blind control
    Cheng, Zhijin
    Zhao, Qianchuan
    Wang, Fulin
    Jiang, Yi
    Xia, Li
    Ding, Jinlei
    ENERGY AND BUILDINGS, 2016, 127 : 43 - 55
  • [24] Behavior Control Algorithm for Mobile Robot Based on Q-Learning
    Yang, Shiqiang
    Li, Congxiao
    2017 INTERNATIONAL CONFERENCE ON COMPUTER NETWORK, ELECTRONIC AND AUTOMATION (ICCNEA), 2017, : 45 - 48
  • [25] Minimax Q-learning control for linear systems using the Wasserstein metric
    Zhao, Feiran
    You, Keyou
    AUTOMATICA, 2023, 149
  • [26] Mobile Robot Navigation: Neural Q-Learning
    Yun, Soh Chin
    Parasuraman, S.
    Ganapathy, V.
    ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, VOL 3, 2013, 178 : 259 - +
  • [27] Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    Ding, Zhengtao
    Jiang, Yi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1308 - 1320
  • [28] Adjustable iterative Q-learning for advanced neural tracking control with stability guarantee
    Wang, Yuan
    Wang, Ding
    Zhao, Mingming
    Liu, Ao
    Qiao, Junfei
    NEUROCOMPUTING, 2024, 584
  • [29] Cooperative Q-Learning Based on Learning Automata
    Yang, Mao
    Tian, Yantao
    Qi, Xinyue
    2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS (ICAL 2009), VOLS 1-3, 2009, : 1972 - 1977
  • [30] Compatibility and Performance Improvement of the WPT Systems Based on Q-Learning Algorithm
    Liu, Xu
    Chao, Jie
    Rong, Cancan
    Liao, Zhijuan
    Xia, Chenyang
    IEEE TRANSACTIONS ON POWER ELECTRONICS, 2024, 39 (08) : 10582 - 10593