Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Times Cited: 0
Authors
Si, Yanna [1]
Pu, Jiexin [1]
Zang, Shaofei [1]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China
Source
ICCAIS 2019: THE 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES | 2019
Keywords
Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems
DOI
Not available
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
To solve the control problem of nonlinear systems with continuous state spaces, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the "curse of dimensionality" that limits classical reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent over samples drawn from experience replay updates the network parameters, which effectively reduces the number of iterations and increases the learning speed. Moreover, momentum optimization is introduced to further stabilize training and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
Pages: 5
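
The abstract outlines the full training procedure, so a brief illustration may be useful. In the residual gradient method (Baird, 1995), the parameters theta descend the full gradient of the squared Bellman error, E(theta) = 1/2 E[(r + gamma * max_a' Q(s', a'; theta) - Q(s, a; theta))^2]; that is, the gradient also flows through the bootstrapped target, unlike the semi-gradient update of standard Q-learning. The Python sketch below combines that update with mini-batch sampling from an experience replay buffer, momentum SGD, and epsilon-decreasing action selection, matching the abstract's components. It is a minimal sketch, not the paper's implementation: the network architecture, learning rate, buffer size, and decay schedule are illustrative assumptions.

# Minimal sketch of residual-gradient neural Q-learning for a
# discrete-action task with vector states (e.g. CartPole).
# Hyperparameters below are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Feedforward network approximating Q(s, .) for all actions."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, s):
        return self.net(s)

state_dim, n_actions, gamma = 4, 2, 0.99
qnet = QNet(state_dim, n_actions)
# Momentum SGD smooths the noisy mini-batch residual-gradient updates.
opt = torch.optim.SGD(qnet.parameters(), lr=1e-3, momentum=0.9)
# Experience replay buffer of (state, action, reward, next_state, done).
buffer = deque(maxlen=10_000)

def select_action(state, episode, eps0=1.0, eps_min=0.05, decay=0.995):
    # epsilon-decreasing: exploration probability shrinks over episodes.
    eps = max(eps_min, eps0 * decay ** episode)
    if random.random() < eps:
        return random.randrange(n_actions)
    with torch.no_grad():
        q = qnet(torch.as_tensor(state, dtype=torch.float32))
        return q.argmax().item()

def residual_gradient_step(batch_size=32):
    if len(buffer) < batch_size:
        return
    s, a, r, s2, done = map(
        lambda x: torch.as_tensor(x, dtype=torch.float32),
        zip(*random.sample(buffer, batch_size)))
    q_sa = qnet(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    # Residual gradient: the bootstrapped target is NOT detached, so the
    # gradient of the squared Bellman error also flows through Q(s', a').
    target = r + gamma * (1 - done) * qnet(s2).max(dim=1).values
    loss = 0.5 * (target - q_sa).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

In a CartPole-style loop, each transition (state, action, reward, next_state, done) is appended to buffer and residual_gradient_step() is called once per environment step. Keeping the target inside the computation graph is the defining property of the residual gradient method; detaching it would recover the usual semi-gradient Q-learning update.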