Geodesic Gaussian kernels for value function approximation

Cited by: 0

Authors:

Masashi Sugiyama

Hirotaka Hachiya

Christopher Towell

Sethu Vijayakumar

Affiliations:

[1] Tokyo Institute of Technology, Department of Computer Science

[2] University of Edinburgh, School of Informatics

Source:

Autonomous Robots | 2008 / Vol. 25

Keywords:

Reinforcement learning; Value function approximation; Markov decision process; Least-squares policy iteration; Gaussian kernel

DOI:

Not available

Abstract:

The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuity which typically arises in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploits the non-linear manifold structure induced by the Markov decision processes. The usefulness of the proposed method is successfully demonstrated in simulated robot arm control and Khepera robot navigation.
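The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a geodesic Gaussian kernel replaces the Euclidean distance in an ordinary Gaussian kernel with the shortest-path length on a graph of states, computed here with Dijkstra's algorithm. The grid world, the wall layout, the unit edge weights, and all function names (`grid_graph`, `shortest_path_lengths`, `geodesic_gaussian`, `euclidean_gaussian`) are hypothetical choices made for illustration.

```python
import heapq
import math


def grid_graph(width, height, walls):
    """Build an adjacency list for a 4-connected grid, skipping wall cells (assumed toy MDP)."""
    cells = [(x, y) for x in range(width) for y in range(height) if (x, y) not in walls]
    adjacency = {cell: [] for cell in cells}
    for (x, y) in cells:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (x + dx, y + dy)
            if neighbor in adjacency:
                adjacency[(x, y)].append((neighbor, 1.0))  # unit step cost (assumption)
    return adjacency


def shortest_path_lengths(adjacency, source):
    """Dijkstra's algorithm: shortest-path length from source to every reachable state."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_gaussian(adjacency, center, sigma):
    """Gaussian of the shortest-path distance to the kernel center."""
    dist = shortest_path_lengths(adjacency, center)
    return {s: math.exp(-d * d / (2.0 * sigma ** 2)) for s, d in dist.items()}


def euclidean_gaussian(adjacency, center, sigma):
    """Ordinary Gaussian of the straight-line distance, ignoring walls."""
    return {
        s: math.exp(-((s[0] - center[0]) ** 2 + (s[1] - center[1]) ** 2) / (2.0 * sigma ** 2))
        for s in adjacency
    }


# A 5x5 grid with a wall in column 2 that leaves a gap only at y = 4.
walls = {(2, 0), (2, 1), (2, 2), (2, 3)}
adjacency = grid_graph(5, 5, walls)
center = (1, 0)
geo = geodesic_gaussian(adjacency, center, sigma=2.0)
euc = euclidean_gaussian(adjacency, center, sigma=2.0)

# The state (3, 0) sits just across the wall from the center.
print("geodesic kernel value :", geo[(3, 0)])
print("Euclidean kernel value:", euc[(3, 0)])
```

In this toy layout the state directly across the wall is close to the kernel center in Euclidean terms but far along the graph, so the geodesic kernel assigns it a much smaller value. That is how a basis function of this kind can represent a value function that changes sharply across an obstacle.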

Citation

Pages: 287-304

Page count: 17
