Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm

Cited by: 49
Authors
Ashraf, Nesma M. [1 ]
Mostafa, Reham R. [2 ]
Sakr, Rasha H. [1 ]
Rashad, M. Z. [1 ]
Affiliations
[1] Mansoura Univ, Fac Comp & Informat Sci, Comp Sci Dept, Mansoura, Egypt
[2] Mansoura Univ, Fac Comp & Informat Sci, Informat Syst Dept, Mansoura, Egypt
Source
PLOS ONE | 2021, Vol. 16, Issue 6
Keywords
LEVEL; GAME; GO;
DOI
10.1371/journal.pone.0252754
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy, Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07 ; 0710 ; 09 ;
Abstract
Deep Reinforcement Learning (DRL) enables agents to make decisions based on a well-designed reward function suited to a particular environment, without prior knowledge of that environment. The choice of hyperparameters has a great impact on the overall learning process and on training time, so hyperparameters must be estimated accurately while training DRL algorithms; this is one of the key challenges that we attempt to address. This paper employs a swarm-based optimization algorithm, namely the Whale Optimization Algorithm (WOA), to optimize the hyperparameters of the Deep Deterministic Policy Gradient (DDPG) algorithm and achieve an optimal control strategy for an autonomous driving control problem. DDPG can handle complex environments with continuous action spaces. To evaluate the proposed algorithm, The Open Racing Car Simulator (TORCS), a realistic autonomous driving simulation environment, was chosen for its ease of design and implementation. Using TORCS, a DDPG agent with optimized hyperparameters was compared against a DDPG agent with reference hyperparameters. The experimental results showed that optimizing the DDPG hyperparameters maximizes the total reward over the testing episodes while maintaining a stable driving policy.
Pages: 24
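The hyperparameter search described in the abstract can be illustrated with a minimal sketch of WOA wrapped around a DDPG evaluation. This is not the authors' implementation: the search space (actor/critic learning rates, discount factor, soft-update rate), the bounds, and the `evaluate_ddpg` fitness stub are assumptions made only so the sketch is self-contained and runnable; in the paper's setting, the fitness would be the total reward of a DDPG agent trained in TORCS with the candidate hyperparameters.

```python
# Minimal sketch (not the authors' code): WOA searching DDPG hyperparameters.
import numpy as np

# Hypothetical search space (assumption, not from the paper):
# [actor learning rate, critic learning rate, discount factor gamma, soft-update rate tau]
LOWER = np.array([1e-5, 1e-5, 0.90, 1e-3])
UPPER = np.array([1e-2, 1e-2, 0.999, 1e-1])

def evaluate_ddpg(hparams):
    """Placeholder fitness. In the paper's setting this would train a DDPG agent
    in TORCS with the given hyperparameters and return its total reward."""
    actor_lr, critic_lr, gamma, tau = hparams
    # Dummy surrogate so the sketch runs without TORCS or a DRL library installed.
    return -((np.log10(actor_lr) + 3.5) ** 2 + (gamma - 0.99) ** 2)

def woa(fitness, n_whales=10, n_iters=30, b=1.0, seed=0):
    """Whale Optimization Algorithm: maximizes `fitness` over the box [LOWER, UPPER]."""
    rng = np.random.default_rng(seed)
    dim = LOWER.size
    X = rng.uniform(LOWER, UPPER, size=(n_whales, dim))      # whale positions
    scores = np.array([fitness(x) for x in X])
    best, best_score = X[scores.argmax()].copy(), scores.max()

    for t in range(n_iters):
        a = 2.0 - 2.0 * t / n_iters                           # decreases linearly from 2 to 0
        for i in range(n_whales):
            r1, r2 = rng.random(dim), rng.random(dim)
            A, C = 2 * a * r1 - a, 2 * r2
            if rng.random() < 0.5:
                if np.all(np.abs(A) < 1):                     # exploitation: encircle the best whale
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:                                         # exploration: move toward a random whale
                    X_rand = X[rng.integers(n_whales)]
                    D = np.abs(C * X_rand - X[i])
                    X[i] = X_rand - A * D
            else:                                             # spiral (bubble-net) update around the best whale
                l = rng.uniform(-1, 1)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(b * l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], LOWER, UPPER)
            score = fitness(X[i])
            if score > best_score:
                best, best_score = X[i].copy(), score
    return best, best_score

if __name__ == "__main__":
    hparams, fitness_value = woa(evaluate_ddpg)
    print("best hyperparameters found:", hparams, "fitness:", fitness_value)
```

Because each fitness evaluation would correspond to a full DDPG training run in TORCS, the number of whales and iterations dominates the cost of the search; the small population and iteration counts above are placeholders.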