FISAR: Forward Invariant Safe Reinforcement Learning with a Deep Neural Network-Based Optimizer

Cited by: 5
Authors
Sun, Chuangchuang [1 ]
Kim, Dong-Ki [1 ]
How, Jonathan P. [1 ]
Affiliations
[1] MIT, Lab Informat & Decis Syst LIDS, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021) | 2021
Keywords
ALGORITHMS;
DOI
10.1109/ICRA48506.2021.9561147
CLC Classification Code
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
This paper investigates reinforcement learning with constraints, which are indispensable in safety-critical environments. To drive the constraint violation to decrease monotonically, we treat the constraints as Lyapunov functions and impose new linear constraints on the updating dynamics of the policy parameters. As a result, the original safety set is forward-invariant. However, because the new, guaranteed-feasible constraints are imposed on the updating dynamics rather than on the original policy parameters, classic optimization algorithms no longer apply. To address this, we propose to learn a generic deep neural network (DNN)-based optimizer that optimizes the objective while satisfying the linear constraints. Constraint satisfaction is achieved via projection onto a polytope formed by multiple linear inequality constraints, which can be solved analytically with our newly designed metric. To the best of our knowledge, this is the first DNN-based optimizer for constrained optimization with a forward-invariance guarantee. We show that our optimizer trains a policy to monotonically decrease the constraint violation and maximize the cumulative reward. Results on numerical constrained optimization and obstacle-avoidance navigation validate the theoretical findings.
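The projection step described in the abstract can be illustrated in its simplest form: projecting a candidate update direction onto a single halfspace {d : aᵀd ≤ b} has a closed-form solution under the Euclidean metric. A minimal sketch (illustrative only: the function name is ours, and the paper itself projects onto a polytope of multiple constraints under its own newly designed metric, not the plain Euclidean one shown here):

```python
import numpy as np

def project_halfspace(d, a, b):
    """Project an update direction d onto the halfspace {x : a @ x <= b}.

    Under the Euclidean metric this is analytic: if d already satisfies
    the constraint it is returned unchanged; otherwise the violating
    component along a is removed.
    """
    violation = a @ d - b
    if violation <= 0.0:
        return d  # already feasible, no correction needed
    # Subtract the (normalized) violating component along a.
    return d - (violation / (a @ a)) * a

# Example: the raw direction [1, 1] violates x[0] <= 0, so its
# first component is projected out, leaving [0, 1].
d_safe = project_halfspace(np.array([1.0, 1.0]), np.array([1.0, 0.0]), 0.0)
```

With several inequality constraints active at once, the projection becomes a small quadratic program over the polytope they define; the paper's contribution is a metric under which that projection is still solvable analytically.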
Pages: 10617-10624
Number of pages: 8