Approximating a deep reinforcement learning docking agent using linear model trees

Cited by: 0

Authors:

Gjaerum, Vilde B. [1]

Rorvik, Ella-Lovise H. [2]

Lekkas, Anastasios M. [1]

Affiliations:

[1] Norwegian University of Science and Technology, Department of Engineering Cybernetics, Trondheim, Norway

[2] TronderEnergi, Department of Artificial Intelligence, Trondheim, Norway

Source:

2021 European Control Conference (ECC) | 2021

Keywords:

Deep Reinforcement Learning; Explainable Artificial Intelligence; Linear Model Trees; Docking; Berthing; Autonomous Surface Vessel;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Deep reinforcement learning has led to numerous notable results in robotics. However, deep neural networks (DNNs) are unintuitive, which makes it difficult to understand their predictions and strongly limits their potential for real-world applications due to economic, safety, and assurance reasons. To remedy this problem, a number of explainable AI methods have been presented, such as SHAP and LIME, but these can either be too costly to be used in real-time robotic applications or provide only local explanations. In this paper, the main contribution is the use of a linear model tree (LMT) to approximate a DNN policy, originally trained via proximal policy optimization (PPO), for an autonomous surface vehicle with five control inputs performing a docking operation. The two main benefits of the proposed approach are: a) LMTs are transparent, which makes it possible to directly associate the outputs (control actions, in our case) with specific values of the input features; b) LMTs are computationally efficient and can provide information in real time. In our simulations, the opaque DNN policy controls the vehicle and the LMT runs in parallel to provide explanations in the form of feature attributions. Our results indicate that LMTs can be a useful component within digital assurance frameworks for autonomous ships.
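
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: `black_box_policy` is a hypothetical stand-in for the trained DNN policy, the action is a single scalar rather than five control inputs, and the tree has only one split. All function names (`fit_linear`, `fit_stump`, `explain`) are made up for illustration. The sketch fits one linear model per leaf to the policy's input-output pairs and reads feature attributions off the active leaf as weight times feature value.

```python
import numpy as np

def black_box_policy(states):
    """Hypothetical stand-in for the trained DNN policy (not the paper's network)."""
    x, y = states[:, 0], states[:, 1]
    # Piecewise-linear in x, so a single split can approximate it well.
    return np.where(x < 0.0, 2.0 * x + 0.5 * y, -1.0 * x + 0.5 * y)

def fit_linear(X, y):
    """Least-squares linear model with intercept; returns (weights, bias, sse)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return coef[:-1], coef[-1], float(residual @ residual)

def fit_stump(X, y, min_leaf=20):
    """Depth-1 linear model tree: choose the split with the lowest total squared error."""
    best = None
    for feature in range(X.shape[1]):
        # Candidate thresholds are quantiles of the feature (an illustrative choice).
        for threshold in np.quantile(X[:, feature], np.linspace(0.1, 0.9, 17)):
            left = X[:, feature] < threshold
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            wl, bl, el = fit_linear(X[left], y[left])
            wr, br, er = fit_linear(X[~left], y[~left])
            if best is None or el + er < best["sse"]:
                best = {"feature": feature, "threshold": float(threshold),
                        "left": (wl, bl), "right": (wr, br), "sse": el + er}
    return best

def explain(tree, state):
    """Return (predicted action, per-feature contributions w_i * x_i, leaf bias)."""
    side = "left" if state[tree["feature"]] < tree["threshold"] else "right"
    w, b = tree[side]
    contributions = w * state
    return float(contributions.sum() + b), contributions, float(b)

# Sample states, query the opaque policy, and fit the surrogate tree to its outputs.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(2000, 2))
actions = black_box_policy(states)
tree = fit_stump(states, actions)

# Explain one state: the contributions plus the bias add up to the surrogate's action.
action, contributions, bias = explain(tree, np.array([0.5, -0.2]))
print(tree["feature"], tree["threshold"], action, contributions, bias)
```

Because each leaf is a plain linear model, an explanation costs one tree traversal and one dot product, which is why a surrogate of this kind can run alongside the policy in real time. A deeper tree would repeat the same split search recursively within each leaf.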


Pages: 1465-1471

Page count: 7
