An improved Q-learning algorithm based on exploration region expansion strategy

Cited by: 0
Authors
Gao, Qingji [1]
Hong, Bingong [1]
He, Zhendong [2]
Liu, Jie [2]
Niu, Guochen [2]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Civil Aviat Univ China, Inst Res Robot, Tianjin 300300, Peoples R China
Source
WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS | 2006
Keywords
Q-learning; exploration region expansion; exploration-exploitation; Metropolis criterion;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To address one of the key problems in Q-learning, keeping the balance between exploration and exploitation, an improved Q-learning algorithm based on an exploration region expansion strategy is proposed, building on Metropolis criterion-based Q-learning. With this strategy, blind exploration of the entire environment is eliminated and learning efficiency is increased. In addition, when the agent encounters obstacles, another feasible path is sought, which makes the algorithm easy to implement on a real robot. An automatic termination condition is also put forward, so that redundant learning after the optimal path has been found is avoided and learning time is reduced. The validity of the algorithm is demonstrated by simulation experiments.
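The abstract does not give the algorithm's equations, so the following is only a minimal sketch of the baseline it builds on: action selection under a Metropolis-style acceptance test, paired with the standard one-step Q-learning update. The function names, the temperature parameter, and the acceptance rule exp((Q_random - Q_greedy) / T) are assumptions chosen for illustration, not the authors' formulation. The exploration region expansion strategy and the automatic termination condition are not sketched, since the abstract gives no detail on them.

```python
import math
import random


def metropolis_select(q_row, temperature, rng=random):
    """Pick an action index from one state's Q-values (hypothetical helper).

    Assumed rule: draw a random candidate action and accept it over the
    greedy action with probability exp((Q_candidate - Q_greedy) / T).
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    if temperature <= 0:
        # No exploration left: always exploit.
        return greedy
    candidate = rng.randrange(len(q_row))
    # The exponent is never positive, so accept_prob lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

Under this assumed rule, a high temperature makes the random candidate likely to be accepted (exploration), and lowering the temperature over episodes shifts the choice toward the greedy action (exploitation).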
Pages: 4167 / +
Page count: 2
Related papers
50 in total
  • [21] Q-learning with heterogeneous update strategy
    Tan, Tao
    Xie, Hong
    Feng, Liang
    [J]. INFORMATION SCIENCES, 2024, 656
  • [22] Improved Q-Learning Algorithm Based on Approximate State Matching in Agricultural Plant Protection Environment
    Sun, Fengjie
    Wang, Xianchang
    Zhang, Rui
    [J]. ENTROPY, 2021, 23 (06)
  • [23] Can Q-learning be Improved with Advice?
    Golowich, Noah
    Moitra, Ankur
    [J]. CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
  • [24] Q-Learning Based Forwarding Strategy in Named Data Networks
    Hnaien, Hend
    Touati, Haifa
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2020, PT I, 2020, 12249 : 434 - 444
  • [25] Fault diagnosis of track circuit based on improved sparrow search algorithm and Q-Learning optimization for ensemble learning
    Xu K.
    Zheng H.
    Tu Y.
    Wu S.
    [J]. Journal of Railway Science and Engineering, 2023, 20 (11) : 4426 - 4437
  • [26] ENHANCEMENTS OF FUZZY Q-LEARNING ALGORITHM
    Glowaty, Grzegorz
    [J]. COMPUTER SCIENCE-AGH, 2005, 7 : 77 - 87
  • [27] Heuristically accelerated Q-learning algorithm based on Laplacian Eigenmap
    Zhu, Mei-Qiang
    Li, Ming
    Cheng, Yu-Hu
    Zhang, Qian
    Wang, Xue-Song
    [J]. Kongzhi yu Juece/Control and Decision, 2014, 29 (03) : 425 - 430
  • [28] Design of cognitive radar jamming based on Q-learning algorithm
    Li, Yun-Jie
    Zhu, Yun-Peng
    Gao, Mei-Guo
    [J]. Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2015, 35 (11) : 1194 - 1199
  • [29] Q-learning based Air Combat Target Assignment Algorithm
    Luo, Peng-Cheng
    Xie, Jun-jie
    Che, Wan-Fang
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2016 : 779 - 783
  • [30] Dynamic feature selection algorithm based on Q-learning mechanism
    Ruohao Xu
    Mengmeng Li
    Zhongliang Yang
    Lifang Yang
    Kangjia Qiao
    Zhigang Shang
    [J]. Applied Intelligence, 2021, 51 : 7233 - 7244