Learning Control for Air Hockey Striking using Deep Reinforcement Learning

Cited by: 6

Authors:

Taitler, Ayal [1]

Shimkin, Nahum [1]

Affiliation:

[1] Technion Israel Inst Technol, Fac Elect Engn, IL-32000 Haifa, Israel

Source:

2017 INTERNATIONAL CONFERENCE ON CONTROL, ARTIFICIAL INTELLIGENCE, ROBOTICS & OPTIMIZATION (ICCAIRO) | 2017

DOI:

10.1109/ICCAIRO.2017.14

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

We consider the task of learning control policies for a robotic mechanism striking a puck in an air hockey game. The control signal is a direct command to the robot's motors. We employ a model-free deep reinforcement learning framework to learn the motor skills of striking the puck accurately in order to score. We propose certain improvements to the standard learning scheme which make the deep Q-learning algorithm feasible when it might otherwise fail. Our improvements include integrating prior knowledge into the learning scheme and accounting for the changing distribution of samples in the experience replay buffer. Finally, we present simulation results for aimed striking which demonstrate the successful learning of this task and the improvement in algorithm stability due to the proposed modifications.

Pages: 22-27

Page count: 6

