Safe Reinforcement Learning for an Energy-Efficient Driver Assistance System

Cited by: 3

Authors:

Hailemichael, Habtamu [1]

Ayalew, Beshah [1]

Kerbel, Lindsey [1]

Ivanco, Andrej [2]

Loiselle, Keith [2]

Affiliations:

[1] Clemson Univ, Automot Engn, Greenville, SC 29607 USA

[2] Allison Transmiss Inc, One Allison Way, Indianapolis, IN 46222 USA

Source:

IFAC PapersOnLine | 2022 / Vol. 55 / Issue 37

Keywords:

RL driver-assist; Safe reinforcement learning; Safety filtering; Control barrier functions

DOI:

10.1016/j.ifacol.2022.11.250

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Reinforcement learning (RL)-based driver assistance systems seek to reduce fuel consumption by continually improving powertrain control actions using experiential data from the field. However, the need to explore diverse experiences in order to learn optimal policies often limits the application of RL techniques in safety-critical systems such as vehicle control. In this paper, an exponential control barrier function (ECBF) is derived and used to filter unsafe actions proposed by an RL-based driver assistance system. The RL agent freely explores and optimizes the performance objectives, while unsafe actions are projected to the closest actions in the safe domain. The reward is structured so that the driver's acceleration requests are met in a manner that boosts fuel economy and does not compromise comfort. The optimal gear and traction torque control actions that maximize the cumulative reward are computed via the Maximum a Posteriori Policy Optimization (MPO) algorithm configured for a hybrid action space. The proposed safe-RL scheme is trained and evaluated in car-following scenarios, where it is shown to avoid collisions effectively during both training and evaluation while delivering the expected fuel economy improvements for the driver assistance system. Copyright (c) 2022 The Authors. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0).
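The safety-filtering idea in the abstract (project an unsafe RL action to the closest safe action) can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the barrier function, vehicle model, or gains, so everything below is an assumption made for illustration. The sketch assumes a double-integrator car-following model, a scalar acceleration action instead of the paper's gear and traction torque, a barrier h = gap - d_min, and hypothetical gains k0, k1 and actuator limits. With relative degree two, the ECBF condition h'' + k1*h' + k0*h >= 0 reduces to an upper bound on ego acceleration, and for a scalar action the closest safe action is obtained by clipping.

```python
from dataclasses import dataclass


@dataclass
class CarFollowingState:
    """Hypothetical car-following state (names are illustrative)."""
    gap: float           # distance to the lead vehicle [m]
    v_ego: float         # ego vehicle speed [m/s]
    v_lead: float        # lead vehicle speed [m/s]
    a_lead: float = 0.0  # lead acceleration [m/s^2], assumed known or estimated


def ecbf_accel_bound(s, d_min=5.0, k0=1.0, k1=2.0):
    """Upper bound on ego acceleration implied by the ECBF condition.

    Barrier (assumed): h = gap - d_min, so h' = v_lead - v_ego and
    h'' = a_lead - a_ego. Requiring h'' + k1*h' + k0*h >= 0 gives
    a_ego <= a_lead + k1*h' + k0*h. The gains k0=1, k1=2 place both
    roots of s^2 + k1*s + k0 at -1 (real and negative).
    """
    h = s.gap - d_min
    h_dot = s.v_lead - s.v_ego
    return s.a_lead + k1 * h_dot + k0 * h


def safety_filter(a_rl, s, a_min=-6.0, a_max=3.0, **ecbf_kwargs):
    """Return the acceleration closest to a_rl that satisfies the bound.

    For a scalar action the projection onto the safe set is a clip.
    If the bound is below the assumed braking limit a_min, the filter
    falls back to maximum braking.
    """
    upper = min(a_max, ecbf_accel_bound(s, **ecbf_kwargs))
    upper = max(upper, a_min)
    return min(max(a_rl, a_min), upper)


if __name__ == "__main__":
    far = CarFollowingState(gap=50.0, v_ego=20.0, v_lead=20.0)
    closing = CarFollowingState(gap=10.0, v_ego=20.0, v_lead=18.0)
    print(safety_filter(1.0, far))      # proposed action already safe
    print(safety_filter(2.0, closing))  # proposed action reduced by the filter
```

In this sketch the RL policy is untouched: the filter sits between the policy output and the vehicle, which matches the abstract's description of free exploration with unsafe actions replaced by the nearest safe ones. A multi-dimensional or hybrid action space, as in the paper, would generally need a constrained optimization step rather than a clip.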

Citation

Pages: 615-620

Page count: 6

24 references in total

[1] Abdolmaleki A, 2018, arXiv, DOI arXiv:1806.06920

[2] Altman E. Denumerable constrained Markov decision processes and finite approximations [J]. Mathematics of Operations Research, 1994, 19(1): 169-191

[3] Ames AD, 2019, 2019 18th European Control Conference (ECC), P3420, DOI 10.23919/ECC.2019.8796030

[4] Ames AD, 2014, IEEE DECIS CONTR P, P6271, DOI 10.1109/CDC.2014.7040372

[5] Barkenbus JN. Eco-driving: An overlooked climate change initiative [J]. Energy Policy, 2010, 38(2): 762-769

[6] Barlow TJ, 2009, TRL Published Project Report

[7] Bureau of Transportation Statistics, 2017, FREIGHT AN FRAM VERS

[8] Cheng R, 2019, AAAI CONF ARTIF INTE, P3387

[9] Dalal G, 2018, Safe exploration in continuous action spaces

[10] Kerbel L, 2022, Driver Assistance Eco-driving and Transmission Control with Deep Reinforcement Learning
