Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning

Cited by: 0

Authors:

Tianyu Wang

Vikas Dhiman

Nikolay Atanasov

Affiliations:

[1] University of California, San Diego, Electrical and Computer Engineering

[2] The University of Maine

Source:

Autonomous Robots | 2023 / Volume 47

Keywords:

Inverse reinforcement learning; Semantic mapping; Autonomous navigation

DOI:

Not available

Abstract:

This paper focuses on inverse reinforcement learning for autonomous navigation using distance and semantic category observations. The objective is to infer a cost function that explains demonstrated behavior while relying only on the expert’s observations and state-control trajectory. We develop a map encoder, which infers semantic category probabilities from the observation sequence, and a cost encoder, defined as a deep neural network over the semantic features. Since the expert cost is not directly observable, the model parameters can only be optimized by differentiating the error between demonstrated controls and a control policy computed from the cost estimate. We propose a new model of expert behavior that enables error minimization using a closed-form subgradient computed only over a subset of promising states via a motion planning algorithm. Our approach allows the learned behavior to generalize to new environments with new spatial configurations of the semantic categories. We analyze the different components of our model in a minigrid environment. We also demonstrate that our approach learns to follow traffic rules in the CARLA autonomous driving simulator by relying on semantic observations of buildings, sidewalks, and road lanes.
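
The pipeline described in the abstract (semantic probabilities, then a cost estimate, then a policy compared against demonstrated controls) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a 3x3 grid, a linear cost in place of the deep cost encoder, Dijkstra as the planner, and a softmax (Boltzmann) policy over cost-to-go values as one plausible expert model. All names (`cell_costs`, `cost_to_go`, `boltzmann_policy`, `MOVES`) and all numbers are hypothetical.

```python
import heapq
import math

# Hypothetical 4-connected controls: name -> (d_row, d_col)
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def cell_costs(class_probs, weights):
    """Linear stand-in for the cost encoder: expected cost of each cell."""
    return [[sum(p * w for p, w in zip(cell, weights)) for cell in row]
            for row in class_probs]

def cost_to_go(costs, goal):
    """Dijkstra from the goal; entering a cell incurs that cell's cost."""
    rows, cols = len(costs), len(costs[0])
    value = [[math.inf] * cols for _ in range(rows)]
    value[goal[0]][goal[1]] = 0.0
    queue = [(0.0, goal)]
    while queue:
        v, (r, c) = heapq.heappop(queue)
        if v > value[r][c]:
            continue  # stale queue entry
        for dr, dc in MOVES.values():
            pr, pc = r + dr, c + dc  # neighbor that can step into (r, c)
            if 0 <= pr < rows and 0 <= pc < cols:
                cand = v + costs[r][c]
                if cand < value[pr][pc]:
                    value[pr][pc] = cand
                    heapq.heappush(queue, (cand, (pr, pc)))
    return value

def boltzmann_policy(costs, value, state):
    """Softmax over negative Q-values, Q = step cost + cost-to-go."""
    rows, cols = len(costs), len(costs[0])
    q = {}
    for name, (dr, dc) in MOVES.items():
        nr, nc = state[0] + dr, state[1] + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            q[name] = costs[nr][nc] + value[nr][nc]
    q_min = min(q.values())  # subtract the minimum for numerical stability
    unnorm = {a: math.exp(-(qa - q_min)) for a, qa in q.items()}
    total = sum(unnorm.values())
    return {a: u / total for a, u in unnorm.items()}

# Two assumed semantic classes per cell: [P(road), P(sidewalk)].
ROAD, WALK = [0.9, 0.1], [0.1, 0.9]
class_probs = [[ROAD, ROAD, ROAD],
               [ROAD, WALK, ROAD],
               [ROAD, ROAD, ROAD]]
weights = [1.0, 5.0]  # assumed cost parameters: sidewalk is expensive

costs = cell_costs(class_probs, weights)
value = cost_to_go(costs, goal=(0, 1))
policy = boltzmann_policy(costs, value, state=(2, 1))

# Negative log-likelihood of a hypothetical demonstrated control.
loss = -math.log(policy["left"])
print(policy, loss)
```

In this toy setup the policy at the bottom-center cell favors going around the sidewalk cell over crossing it. A learning loop would adjust `weights` to reduce `loss` over demonstrations; that step is omitted here.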

Pages: 809-830

Page count: 21
