Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Times Cited: 0
Authors
Si, Yanna [1]
Pu, Jiexin [1]
Zang, Shaofei [1]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China
Source
ICCAIS 2019: THE 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES | 2019
Keywords
Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems
DOI
Not available
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
To solve the control problem of nonlinear systems with continuous state spaces, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the "curse of dimensionality" that limits classical reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent over samples drawn from experience replay updates the network parameters, which effectively reduces the number of iterations and increases the learning speed. Moreover, momentum optimization is introduced to further stabilize training and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
Pages: 5
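
The abstract outlines the full training procedure, so a brief illustration may be useful. In the residual gradient method (Baird, 1995), the parameters theta descend the full gradient of the squared Bellman error, E(theta) = 1/2 E[(r + gamma * max_a' Q(s', a'; theta) - Q(s, a; theta))^2]; that is, the gradient also flows through the bootstrapped target, unlike the semi-gradient update of standard Q-learning. The Python sketch below combines that update with mini-batch sampling from an experience replay buffer, momentum SGD, and epsilon-decreasing action selection, matching the abstract's components. It is a minimal sketch, not the paper's implementation: the network architecture, learning rate, buffer size, and decay schedule are illustrative assumptions.

# Minimal sketch of residual-gradient neural Q-learning for a
# discrete-action task with vector states (e.g. CartPole).
# Hyperparameters below are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Feedforward network approximating Q(s, .) for all actions."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, s):
        return self.net(s)

state_dim, n_actions, gamma = 4, 2, 0.99
qnet = QNet(state_dim, n_actions)
# Momentum SGD smooths the noisy mini-batch residual-gradient updates.
opt = torch.optim.SGD(qnet.parameters(), lr=1e-3, momentum=0.9)
# Experience replay buffer of (state, action, reward, next_state, done).
buffer = deque(maxlen=10_000)

def select_action(state, episode, eps0=1.0, eps_min=0.05, decay=0.995):
    # epsilon-decreasing: exploration probability shrinks over episodes.
    eps = max(eps_min, eps0 * decay ** episode)
    if random.random() < eps:
        return random.randrange(n_actions)
    with torch.no_grad():
        q = qnet(torch.as_tensor(state, dtype=torch.float32))
        return q.argmax().item()

def residual_gradient_step(batch_size=32):
    if len(buffer) < batch_size:
        return
    s, a, r, s2, done = map(
        lambda x: torch.as_tensor(x, dtype=torch.float32),
        zip(*random.sample(buffer, batch_size)))
    q_sa = qnet(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    # Residual gradient: the bootstrapped target is NOT detached, so the
    # gradient of the squared Bellman error also flows through Q(s', a').
    target = r + gamma * (1 - done) * qnet(s2).max(dim=1).values
    loss = 0.5 * (target - q_sa).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

In a CartPole-style loop, each transition (state, action, reward, next_state, done) is appended to buffer and residual_gradient_step() is called once per environment step. Keeping the target inside the computation graph is the defining property of the residual gradient method; detaching it would recover the usual semi-gradient Q-learning update.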