Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles

Cited by: 6

Authors:

Gupta, Abhishek [1]

Khwaja, Ahmed Shaharyar [1]

Anpalagan, Alagan [1]

Guan, Ling [1]

Venkatesh, Bala [1]

Affiliation:

[1] Ryerson Univ, Dept Elect Comp & Biomed Engn, Toronto, ON M5B 2K3, Canada

Source:

SENSORS | 2020 / Volume 20 / Issue 21

Keywords:

state representation learning; variational auto encoder; deep deterministic policy gradient; soft actor-critic; autonomous driving; Markov decision process; DEEP; NETWORK; MODEL;

DOI:

10.3390/s20215991

Chinese Library Classification number:

O65 [Analytical Chemistry];

Subject classification codes:

070302 ; 081704 ;

Abstract:

In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under deterministic as well as stochastic policy gradients. Through a combination of a variational autoencoder (VAE), deep deterministic policy gradient (DDPG), and soft actor-critic (SAC), we focus on uninterrupted and reasonably safe autonomous driving without steering off the track for a considerable driving distance. Our proposed technique exhibits learning in autonomous vehicles under complex interactions with the environment, without being explicitly trained on driving datasets. To ensure the effectiveness of the scheme over a sustained period of time, we employ a reward-penalty based system in which a negative reward is associated with an unfavourable action and a positive reward is awarded for a favourable action. Simulation results on the DonKey simulator show the effectiveness of our proposed method by examining the variations in policy loss, value loss, reward function, and cumulative reward for 'VAE+DDPG' and 'VAE+SAC' over the learning process.


Pages: 1 / 20

Number of pages: 20

