Socially Adaptive Path Planning in Human Environments Using Inverse Reinforcement Learning

Cited by: 3
Authors
Beomjoon Kim
Joelle Pineau
Affiliation
[1] McGill University, School of Computer Science
Source
International Journal of Social Robotics | 2016, Vol. 8
Keywords
Navigation; Obstacle avoidance; RGB-D optical flow; Learning from demonstration; Inverse reinforcement learning
DOI
Not available
Abstract
A key skill for mobile robots is the ability to navigate efficiently through their environment. In the case of social or assistive robots, this involves navigating through human crowds. Typical performance criteria, such as reaching the goal using the shortest path, are not appropriate in such environments, where it is more important for the robot to move in a socially adaptive manner, such as respecting the comfort zones of pedestrians. We propose a framework for socially adaptive path planning in dynamic environments that generates human-like trajectories. Our framework consists of three modules: a feature extraction module, an inverse reinforcement learning (IRL) module, and a path planning module. The feature extraction module extracts the features necessary to characterize the state information, such as the density and velocity of surrounding obstacles, from an RGB-depth (RGB-D) sensor. The IRL module uses a set of demonstration trajectories generated by an expert to learn the expert's behaviour when faced with different state features, and represents it as a cost function that respects social variables. Finally, the planning module integrates a three-layer architecture: a global path is optimized according to a classical shortest-path objective using a global map known a priori; a local path is planned over a shorter distance using the features extracted from the RGB-D sensor and the cost function inferred by the IRL module; and a low-level system handles avoidance of immediate obstacles. We evaluate our approach by deploying it on a real robotic wheelchair platform in various scenarios and comparing the robot trajectories to human trajectories.
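To make the role of the learned cost function concrete, the following is a minimal sketch of how a local planner might score candidate paths. It assumes the cost is a weighted sum of per-cell features, which the abstract does not state; the feature names (`density`, `speed`, `step`), the weight values, and the functions `cell_cost`, `path_cost`, and `pick_local_path` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical weights standing in for what an IRL module might learn.
# The abstract gives neither the feature set nor any weight values.
WEIGHTS: Dict[str, float] = {"density": 2.0, "speed": 1.0, "step": 0.1}

Cell = Dict[str, float]


def cell_cost(features: Cell, weights: Dict[str, float] = WEIGHTS) -> float:
    """Assumed linear cost: weighted sum of the features seen at one cell."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def path_cost(path: Sequence[Cell], weights: Dict[str, float] = WEIGHTS) -> float:
    """Total cost of a path, summed over its cells."""
    return sum(cell_cost(cell, weights) for cell in path)


def pick_local_path(
    candidates: List[Sequence[Cell]], weights: Dict[str, float] = WEIGHTS
) -> int:
    """Return the index of the candidate path with the lowest total cost."""
    costs = [path_cost(path, weights) for path in candidates]
    return min(range(len(costs)), key=costs.__getitem__)
```

Under these assumed weights, a short path that cuts through a dense group of pedestrians can cost more than a longer detour through free space, which is the kind of trade-off the abstract describes as socially adaptive.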
Pages: 51-66
Number of pages: 15