A novel approach for self-driving car in partially observable environment using life long reinforcement learning

Cited by: 1
Authors
Quadir, Md Abdul [1]
Jaiswal, Dibyanshu [1]
Mohan, Senthilkumar [2]
Innab, Nisreen [3]
Sulaiman, Riza [4]
Alaoui, Mohammed Kbiri [5]
Ahmadian, Ali [6, 7]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai 600127, India
[2] Vellore Inst Technol, Sch Comp Sci Engn & Informat Syst, Vellore 632014, Tamilnadu, India
[3] AlMaarefa Univ, Coll Appl Sci, Dept Comp Sci & Informat Syst, Riyadh, Saudi Arabia
[4] Univ Kebangsaan Malaysia, Inst Visual Informat, Bangi 43600, Malaysia
[5] King Khalid Univ, Coll Sci, Dept Math, Abha 61413, POB 9004, Saudi Arabia
[6] Mediterranea Univ Reggio Calabria, Decis Lab, Reggio Di Calabria, Italy
[7] Istanbul Okan Univ, Fac Engn & Nat Sci, Istanbul, Turkiye
Keywords
Reinforcement Learning; Lifelong Learning; Self-driving car; Lifelong reinforcement learning; Partially observable Environment; POLICY; GAMES
DOI
10.1016/j.segan.2024.101356
Chinese Library Classification
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering]
Subject classification codes
0807; 0820
Abstract
Despite ground-breaking advances in robotics, gaming, and other challenging domains, reinforcement learning still struggles with dynamic, open-world problems. Because reinforcement learning algorithms usually perform poorly on new tasks outside their data distribution, continuous learning algorithms have drawn significant attention. Alongside work on lifelong learning algorithms, there is a need for challenging environments, properly planned trials, and metrics to measure research progress. In this context, this paper proposes a Deep Asynchronous Autonomous Learning System (DAALS) for training a self-driving car in a partially observable environment for real-world scenarios in a continuous state-action space. DAALS uses three algorithms to cover three use cases. To train its agents to learn and update discrete state policies, DAALS uses the Asynchronous Advantage Stager Reviewer (AASR) algorithm. To train its agent for continuous state spaces, DAALS uses an Extensive Deterministic Policy Gradient (EDPG) algorithm. To train the agent in a lifelong learning setting for partially observable environments, DAALS uses a Deep Deterministic Policy Gradient Novel Lifelong Learning Algorithm (DDPGNLLA). The system gives the user the flexibility to train agents for both discrete and continuous state-action spaces. In continuous state-action spaces, the Deep Deterministic Policy Gradient lifelong learning algorithm outperforms previous models by 46.09%. Furthermore, the Deep Asynchronous Autonomous System tends to outperform all previous reinforcement learning algorithms, making the proposed approach a real-world solution. Because DAALS has been tested on a number of different environments, it provides insight into how modern Artificial Intelligence (AI) solutions can be generalized, making it one of the better solutions for general-domain AI problems.
Pages: 12
Related papers
50 in total
  • [21] A Reinforcement Learning Integrated in Heuristic search method for self-driving vehicle using blockchain in supply chain management
    Nasurudeen Ahamed N.
    Karthikeyan P.
    International Journal of Intelligent Networks, 2020, 1 : 92 - 101
  • [22] Real-Time Self-Driving Car Navigation Using Deep Neural Network
    Truong-Dong Do
    Minh-Thien Duong
    Quoc-Vu Dang
    My-Ha Le
    PROCEEDINGS OF 2018 4TH INTERNATIONAL CONFERENCE ON GREEN TECHNOLOGY AND SUSTAINABLE DEVELOPMENT (GTSD), 2018, : 7 - 12
  • [23] Towards Self-driving Car Using Convolutional Neural Network and Road Lane Detector
    Nugraha, Brilian Tafjira
    Su, Shun-Feng
    Fahmizal
    PROCEEDINGS OF THE 2017 2ND INTERNATIONAL CONFERENCE ON AUTOMATION, COGNITIVE SCIENCE, OPTICS, MICRO ELECTRO-MECHANICAL SYSTEM, AND INFORMATION TECHNOLOGY (ICACOMIT), 2017, : 65 - 69
  • [24] Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
    Cilden, Erkin
    Polat, Faruk
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 2, 2012, : 348 - 355
  • [25] A gradient-based reinforcement learning approach to dynamic pricing in partially-observable environments
    Vengerov, David
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2008, 24 (07): : 687 - 693
  • [26] Learning what to memorize: Using intrinsic motivation to form useful memory in partially observable reinforcement learning
    Alper Demir
    Applied Intelligence, 2023, 53 : 19074 - 19092
  • [27] Learning what to memorize: Using intrinsic motivation to form useful memory in partially observable reinforcement learning
    Demir, Alper
    APPLIED INTELLIGENCE, 2023, 53 (16) : 19074 - 19092
  • [28] Scenes Segmentation in Self-driving Car Navigation System Using Neural Network Models with Attention
    Sviatov, Kirill
    Miheev, Alexander
    Kanin, Daniil
    Sukhov, Sergey
    Tronin, Vadim
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2019, PT V: 19TH INTERNATIONAL CONFERENCE, SAINT PETERSBURG, RUSSIA, JULY 14, 2019, PROCEEDINGS, PART V, 2019, 11623 : 278 - 289
  • [29] Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations
    Yaghmaie, Farnaz Adib
    Modares, Hamidreza
    Gustafsson, Fredrik
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (09) : 6397 - 6404
  • [30] Ad Hoc-Obstacle Avoidance-Based Navigation System Using Deep Reinforcement Learning for Self-Driving Vehicles
    Manikandan, N. S.
    Kaliyaperumal, Ganesan
    Wang, Yong
    IEEE ACCESS, 2023, 11 : 92285 - 92297