Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles

Cited by: 6
Authors
Gupta, Abhishek [1]
Khwaja, Ahmed Shaharyar [1]
Anpalagan, Alagan [1]
Guan, Ling [1]
Venkatesh, Bala [1]
Affiliations
[1] Ryerson Univ, Dept Elect Comp & Biomed Engn, Toronto, ON M5B 2K3, Canada
Keywords
state representation learning; variational autoencoder; deep deterministic policy gradient; soft actor-critic; autonomous driving; Markov decision process; DEEP; NETWORK; MODEL
DOI
10.3390/s20215991
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under both deterministic and stochastic policy gradients. Through a combination of a variational autoencoder (VAE), deep deterministic policy gradient (DDPG), and soft actor-critic (SAC), we focus on uninterrupted and reasonably safe autonomous driving without steering off the track for a considerable driving distance. Our proposed technique exhibits learning in autonomous vehicles under complex interactions with the environment, without being explicitly trained on driving datasets. To ensure the effectiveness of the scheme over a sustained period of time, we employ a reward-penalty based system in which a negative reward is associated with an unfavourable action and a positive reward is awarded for a favourable action. The results obtained through simulations on the DonKey simulator show the effectiveness of our proposed method by examining the variations in policy loss, value loss, reward function, and cumulative reward for 'VAE+DDPG' and 'VAE+SAC' over the learning process.
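The reward-penalty idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual reward function, so everything here is an assumption made for illustration: the names `step_reward` and `cumulative_reward`, the use of cross-track error as the off-track signal, and the numeric values `max_cte` and `off_track_penalty` are all hypothetical and are not taken from the paper.

```python
from typing import Iterable


def step_reward(cross_track_error: float,
                max_cte: float = 2.0,
                off_track_penalty: float = -10.0) -> float:
    """Hypothetical per-step reward (not the paper's formula).

    Returns a negative reward when the vehicle leaves the track
    (unfavourable outcome) and a positive reward that grows as the
    vehicle stays closer to the track centre (favourable outcome).
    """
    if abs(cross_track_error) > max_cte:
        return off_track_penalty
    return 1.0 - abs(cross_track_error) / max_cte


def cumulative_reward(cross_track_errors: Iterable[float]) -> float:
    """Sum per-step rewards, ending the episode at the first penalty."""
    total = 0.0
    for cte in cross_track_errors:
        reward = step_reward(cte)
        total += reward
        if reward < 0.0:
            # Assumed convention: leaving the track terminates the episode.
            break
    return total
```

Under these assumed values, an episode with cross-track errors `[0.0, 1.0, 3.0, 0.0]` accumulates `1.0 + 0.5 - 10.0` and stops at the third step, which is the kind of cumulative-reward trace the abstract says is tracked over the learning process.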
Pages: 1 / 20
Number of pages: 20
Related papers
50 in total
  • [1] Policy-Gradient Based Actor-Critic Algorithms
    Awate, Yogesh P.
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL III, 2009, : 505 - 509
  • [2] Soft-Robust Actor-Critic Policy-Gradient
    Derman, Esther
    Mankowitz, Daniel J.
    Mann, Timothy A.
    Mannor, Shie
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2018, : 208 - 218
  • [3] Algorithms for Variance Reduction in a Policy-Gradient Based Actor-Critic Framework
    Awate, Yogesh P.
    ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 130 - 136
  • [4] Learning State Representation for Deep Actor-Critic Control
    Munk, Jelle
    Kober, Jens
    Babuska, Robert
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4667 - 4673
  • [5] Safe Driving of Autonomous Vehicles through State Representation Learning
    Gupta, Abhishek
    Khwaja, Ahmed Shaharyar
    Anpalagan, Alagan
    Guan, Ling
    IWCMC 2021: 2021 17TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE (IWCMC), 2021, : 260 - 265
  • [6] Lexicographic Actor-Critic Deep Reinforcement Learning for Urban Autonomous Driving
    Zhang, Hengrui
    Lin, Youfang
    Han, Sheng
    Lv, Kai
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (04) : 4308 - 4319
  • [7] An actor-critic based learning method for decision-making and planning of autonomous vehicles
    XU Can
    ZHAO WanZhong
    CHEN QingYun
    WANG ChunYan
    Science China(Technological Sciences), 2021, 64 (05) : 984 - 994
  • [8] An actor-critic based learning method for decision-making and planning of autonomous vehicles
    Xu Can
    Zhao WanZhong
    Chen QingYun
    Wang ChunYan
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2021, 64 (05) : 984 - 994
  • [9] An actor-critic based learning method for decision-making and planning of autonomous vehicles
    XU Can
    ZHAO WanZhong
    CHEN QingYun
    WANG ChunYan
    Science China(Technological Sciences), 2021, (05) : 984 - 994
  • [10] An actor-critic based learning method for decision-making and planning of autonomous vehicles
    Can Xu
    WanZhong Zhao
    QingYun Chen
    ChunYan Wang
    Science China Technological Sciences, 2021, 64 : 984 - 994