Pneumatic artificial muscle-driven robot control using local update reinforcement learning

Cited by: 24

Authors:

Cui, Yunduan [1]

Matsubara, Takamitsu [1]

Sugimoto, Kenji [1]

Affiliations:

[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan

Source:

ADVANCED ROBOTICS | 2017 / Vol. 31 / No. 08

Keywords:

Smooth policy update; dynamic policy programming; robot motor learning; SEARCH

DOI:

10.1080/01691864.2016.1274680

Chinese Library Classification:

TP24 [Robotics];

Subject classification codes:

080202 ; 1405 ;

Abstract:

In this study, a new value-function-based reinforcement learning (RL) algorithm, Local Update Dynamic Policy Programming (LUDPP), is proposed. It exploits the smooth policy update induced by the Kullback-Leibler divergence to update its value function locally, which considerably reduces the computational complexity. We first investigated the learning performance of LUDPP and of other algorithms without smooth policy update on pendulum swing-up and n-DOF manipulator reaching tasks in simulation. Only LUDPP could efficiently and stably learn good control policies in high-dimensional systems with a limited number of training samples. As a real-world application, we applied LUDPP to the control of robots driven by Pneumatic Artificial Muscles (PAMs) without knowledge of the model, which is challenging for traditional methods because of the high nonlinearity of the PAMs' air pressure dynamics and mechanical structure. LUDPP successfully achieved one-finger control of the Shadow Dexterous Hand, a PAM-driven humanoid robot hand, with far lower computational resources than other conventional value-function-based RL algorithms.

Pages: 397-412

Page count: 16

