Geodesic Gaussian kernels for value function approximation

Authors
Masashi Sugiyama
Hirotaka Hachiya
Christopher Towell
Sethu Vijayakumar
Affiliations
[1] Tokyo Institute of Technology, Department of Computer Science
[2] University of Edinburgh, School of Informatics
Source
Autonomous Robots | 2008 / Vol. 25
Keywords
Reinforcement learning; Value function approximation; Markov decision process; Least-squares policy iteration; Gaussian kernel
DOI
Not available
Abstract
The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it cannot represent the discontinuities that typically arise in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploits the non-linear manifold structure induced by Markov decision processes. The usefulness of the proposed method is demonstrated in simulated robot arm control and Khepera robot navigation.
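The contrast drawn in the abstract can be illustrated with a minimal sketch. It assumes the geodesic Gaussian kernel takes the form exp(-SP(s, c)^2 / (2 sigma^2)), where SP(s, c) is the shortest-path length between state s and kernel center c on a graph of reachable states; the abstract itself does not state this formula. The grid world, the wall layout, and the helper names (`grid_graph`, `shortest_path_lengths`, `geodesic_gaussian`) are hypothetical and chosen only for illustration.

```python
import heapq
import math


def grid_graph(width, height, walls):
    """Build a 4-connected grid graph; `walls` is a set of blocked (x, y) cells."""
    graph = {}
    for x in range(width):
        for y in range(height):
            if (x, y) in walls:
                continue
            neighbors = []
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                n = (x + dx, y + dy)
                inside = 0 <= n[0] < width and 0 <= n[1] < height
                if inside and n not in walls:
                    neighbors.append((n, 1.0))  # unit edge length (assumption)
            graph[(x, y)] = neighbors
    return graph


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm: shortest-path length from `source` to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_gaussian(graph, center, sigma):
    """Kernel values exp(-SP^2 / (2 sigma^2)) for every state reachable from `center`."""
    dist = shortest_path_lengths(graph, center)
    return {s: math.exp(-d * d / (2.0 * sigma ** 2)) for s, d in dist.items()}


# Hypothetical 5x5 grid with a wall along x = 2 that leaves a gap only at y = 4.
walls = {(2, y) for y in range(4)}
graph = grid_graph(5, 5, walls)
center, target, sigma = (1, 0), (3, 0), 2.0

geo = geodesic_gaussian(graph, center, sigma)
# Ordinary Gaussian kernel on Euclidean distance, which ignores the wall.
euclid = math.exp(-math.dist(center, target) ** 2 / (2.0 * sigma ** 2))

print("ordinary Gaussian:", euclid)
print("geodesic Gaussian:", geo[target])
```

In this sketch the two states sit on opposite sides of the wall, so the ordinary Gaussian kernel treats them as close while the geodesic version, which measures distance along the path around the wall, assigns a value near zero. That sharp drop across the wall is the kind of discontinuity the abstract refers to.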
Pages: 287-304
Page count: 17