Trajectory optimization using reinforcement learning for map exploration

Cited by: 90

Authors:

Kollar, Thomas [1]

Roy, Nicholas [1]

Affiliations:

[1] MIT, Stata Ctr, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA

Source:

INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH | 2008 / Volume 27 / Issue 02

Keywords:

reinforcement learning; trajectory optimization; exploration

DOI:

10.1177/0278364907087426

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification codes:

080202 ; 1405 ;

Abstract:

Automatically building maps from sensor data is a necessary and fundamental skill for mobile robots; as a result, considerable research attention has focused on the technical challenges inherent in the mapping problem. While statistical inference techniques have led to computationally efficient mapping algorithms, the next major challenge in robotic mapping is to automate the data collection process. In this paper, we address the problem of how a robot should plan to explore an unknown environment and collect data in order to maximize the accuracy of the resulting map. We formulate exploration as a constrained optimization problem and use reinforcement learning to find trajectories that lead to accurate maps. We demonstrate this process in simulation and show that the learned policy not only results in improved map building, but that the learned policy also transfers successfully to a real robot exploring on MIT campus.

Citation

Pages: 175-196

Page count: 22
