Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm

Citations: 0
Authors
Wu, Jun [1]
Xu, Xin [1]
Zuo, Lei [1]
Li, Zhaobin [1]
Wang, Jian [1]
Affiliations
[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China
Source
ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT II | 2011 / Vol. 6676
Keywords
reinforcement learning; sparsification; least-squares; gradient descent; kernel width
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Kernel-based Least-squares Policy Iteration (KLSPI) algorithm provides a general reinforcement learning solution for large-scale Markov decision problems. In KLSPI, the Radial Basis Function (RBF) kernel is usually used to approximate the optimal value function with high precision. However, KLSPI can be applied successfully only if a proper kernel width is selected for the RBF kernel. In previous research, the kernel width was usually set manually or calculated in advance from the sample distribution, which requires prior knowledge or model information. In this paper, an adaptive kernel-width selection method is proposed for the KLSPI algorithm. First, a sparsification procedure with neighborhood analysis based on the ℓ2-ball of radius ε is adopted, which yields a reduced kernel dictionary without presetting the kernel width. Second, a gradient descent method based on the Bellman Residual Error (BRE) is proposed to find a kernel width that minimizes the sum of the BRE. The experimental results show that the proposed method helps KLSPI approximate the true value function more accurately and, ultimately, obtain a better control policy.
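The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract gives no algorithmic details. Everything here is an assumption made for illustration: the greedy ε-ball rule in `sparsify`, the state-value (rather than state-action) formulation, the least-squares weight fit inside `bellman_residual_sum`, the finite-difference gradient with step halving in `tune_width`, and the toy 1-D data.

```python
import numpy as np

def sparsify(samples, eps):
    """Greedy eps-ball rule (assumed): keep a sample only if it lies
    farther than eps from every sample already in the dictionary."""
    dictionary = []
    for x in samples:
        if all(np.linalg.norm(x - d) > eps for d in dictionary):
            dictionary.append(x)
    return np.array(dictionary)

def rbf_features(states, centers, width):
    # k(x, c) = exp(-||x - c||^2 / (2 * width^2))
    sq = ((states[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * width ** 2))

def bellman_residual_sum(width, s, r, s_next, centers, gamma):
    """Sum of squared Bellman residuals r + gamma*V(s') - V(s), with the
    linear weights refit by least squares for the given width."""
    a = rbf_features(s, centers, width) - gamma * rbf_features(s_next, centers, width)
    w, *_ = np.linalg.lstsq(a, r, rcond=None)
    return float(((a @ w - r) ** 2).sum())

def tune_width(width, loss, step=0.1, iters=50, h=1e-4, min_width=1e-3):
    """Gradient descent on the width. The gradient is a central finite
    difference; a step is accepted only if it lowers the loss, otherwise
    the step size is halved."""
    best = loss(width)
    for _ in range(iters):
        grad = (loss(width + h) - loss(width - h)) / (2.0 * h)
        candidate = width - step * grad
        if candidate >= min_width and loss(candidate) < best:
            width, best = candidate, loss(candidate)
        else:
            step *= 0.5
    return width, best

# Toy transitions on [0, 1] (hypothetical data, fixed behaviour policy).
rng = np.random.default_rng(0)
s = rng.uniform(0.0, 1.0, size=(200, 1))
s_next = np.clip(s + rng.normal(0.0, 0.05, size=s.shape), 0.0, 1.0)
r = np.sin(2.0 * np.pi * s[:, 0])

eps = 0.1
centers = sparsify(s, eps)            # dictionary built without any kernel width
loss = lambda w: bellman_residual_sum(w, s, r, s_next, centers, gamma=0.9)
width0 = 0.5
initial = loss(width0)
width, final = tune_width(width0, loss)
print(len(centers), width, initial, final)
```

Because `sparsify` uses only Euclidean distances, the dictionary is fixed before any kernel width is chosen, which mirrors the abstract's claim that the width need not be preset. The accept-or-halve rule in `tune_width` is a design choice of this sketch so that the residual sum never increases; the paper may use an analytic gradient and a different step rule.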
Pages: 611-619
Page count: 9