Neural Q-Learning Based on Residual Gradient for Nonlinear Control Systems

Cited: 0
Authors
Si, Yanna [1 ]
Pu, Jiexin [1 ]
Zang, Shaofei [1 ]
Affiliations
[1] Henan Univ Sci & Technol, Sch Informat Engn, Luoyang, Peoples R China
Source
ICCAIS 2019: THE 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES | 2019
Keywords
Q-learning; feedforward neural network; value function approximation; residual gradient method; nonlinear control systems;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
To solve the control problem of nonlinear systems with continuous state spaces, this paper proposes a neural Q-learning algorithm based on the residual gradient method. First, a multi-layer feedforward neural network is used to approximate the Q-value function, overcoming the curse of dimensionality that limits classical tabular reinforcement learning. Then, based on the residual gradient method, mini-batch gradient descent over an experience replay buffer updates the network parameters, which effectively reduces the number of iterations and increases the learning speed. Moreover, momentum optimization is introduced to further stabilize training and improve convergence. To better balance exploration and exploitation, an epsilon-decreasing strategy replaces epsilon-greedy action selection. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
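The record carries only the abstract, not the paper's implementation. The sketch below is a rough PyTorch illustration (not the authors' code) assembling the pieces the abstract names: a feedforward Q-network, an epsilon-decreasing policy, momentum SGD, and a mini-batch residual-gradient update in which the Bellman target is deliberately not detached. The network width, learning rate, momentum, and decay constants are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Multi-layer feedforward network approximating Q(s, .) for all actions."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, s):
        return self.net(s)

state_dim, n_actions, gamma = 4, 2, 0.99   # CartPole-like sizes (assumed)
q = QNet(state_dim, n_actions)
# Momentum optimization, as the abstract describes; lr/momentum are assumptions.
opt = torch.optim.SGD(q.parameters(), lr=1e-3, momentum=0.9)

def epsilon(step, eps0=1.0, eps_min=0.05, decay=0.995):
    """Epsilon-decreasing schedule: exploration probability shrinks over time."""
    return max(eps_min, eps0 * decay ** step)

def select_action(state, step):
    """Epsilon-decreasing action selection over the current Q estimates."""
    if random.random() < epsilon(step):
        return random.randrange(n_actions)
    with torch.no_grad():
        return q(torch.as_tensor(state, dtype=torch.float32)).argmax().item()

def residual_gradient_step(s, a, r, s2, done):
    """One mini-batch update on the squared Bellman residual.

    The bootstrap target r + gamma * max_a' Q(s', a') is NOT detached,
    so the gradient flows through both Q(s, a) and the successor value --
    the residual gradient method, as opposed to the semi-gradient
    (DQN-style) update that treats the target as a constant.
    """
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    target = r + gamma * (1.0 - done) * q(s2).max(dim=1).values  # no .detach()
    loss = ((target - q_sa) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Synthetic transitions stand in for a replay-buffer mini-batch here; in the
# actual task they would be sampled from stored CartPole experience.
B = 32
loss = residual_gradient_step(
    torch.randn(B, state_dim),
    torch.randint(0, n_actions, (B,)),
    torch.randn(B),
    torch.randn(B, state_dim),
    torch.randint(0, 2, (B,)).float(),
)
print(f"Bellman residual loss: {loss:.4f}")
```

The one line separating this from the usual semi-gradient update is the absence of `.detach()` on the bootstrap target: the squared Bellman residual is minimized with respect to both of its terms. Residual gradients are generally better behaved under function approximation than semi-gradients, at the cost of requiring two independent next-state samples for unbiasedness in stochastic environments; CartPole's near-deterministic dynamics sidestep that issue.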
Pages: 5
Related Papers
50 records in total (entries [21]-[30] shown)
  • [21] Learning Hose Transport Control with Q-Learning
    Fernandez-Gauna, Borja
    Lopez-Guede, Jose Manuel
    Zulueta, Ekaitz
    Grana, Manuel
    NEURAL NETWORK WORLD, 2010, 20 (07) : 913 - 923
  • [22] Mobile robot navigation: neural Q-learning
    Parasuraman, S.
    Yun, Soh Chin
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2012, 44 (04) : 303 - 311
  • [23] Satisfaction based Q-learning for integrated lighting and blind control
    Cheng, Zhijin
    Zhao, Qianchuan
    Wang, Fulin
    Jiang, Yi
    Xia, Li
    Ding, Jinlei
    ENERGY AND BUILDINGS, 2016, 127 : 43 - 55
  • [24] Behavior Control Algorithm for Mobile Robot Based on Q-Learning
    Yang, Shiqiang
    Li, Congxiao
    2017 INTERNATIONAL CONFERENCE ON COMPUTER NETWORK, ELECTRONIC AND AUTOMATION (ICCNEA), 2017, : 45 - 48
  • [25] Minimax Q-learning control for linear systems using the Wasserstein metric
    Zhao, Feiran
    You, Keyou
    AUTOMATICA, 2023, 149
  • [26] Mobile Robot Navigation: Neural Q-Learning
    Yun, Soh Chin
    Parasuraman, S.
    Ganapathy, V.
    ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, VOL 3, 2013, 178 : 259 - +
  • [27] Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    Ding, Zhengtao
    Jiang, Yi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1308 - 1320
  • [28] Adjustable iterative Q-learning for advanced neural tracking control with stability guarantee
    Wang, Yuan
    Wang, Ding
    Zhao, Mingming
    Liu, Ao
    Qiao, Junfei
    NEUROCOMPUTING, 2024, 584
  • [29] Cooperative Q-Learning Based on Learning Automata
    Yang, Mao
    Tian, Yantao
    Qi, Xinyue
    2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS (ICAL 2009), VOLS 1-3, 2009, : 1972 - 1977
  • [30] Compatibility and Performance Improvement of the WPT Systems Based on Q-Learning Algorithm
    Liu, Xu
    Chao, Jie
    Rong, Cancan
    Liao, Zhijuan
    Xia, Chenyang
    IEEE TRANSACTIONS ON POWER ELECTRONICS, 2024, 39 (08) : 10582 - 10593