Hierarchical Deep Reinforcement Learning for Continuous Action Control

Cited by: 120

Authors:

Yang, Zhaoyang [1,2]

Merrick, Kathryn [1]

Jin, Lianwen [3]

Abbass, Hussein A. [1]

Affiliations:

[1] Univ New South Wales, Sch Engn & Informat Technol, Canberra, ACT 2612, Australia

[2] South China Univ Technol, Coll Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China

[3] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2018 / Vol. 29 / No. 11

Funding:

Australian Research Council;

Keywords:

Continuous control; deep learning; hierarchical learning; reinforcement learning; NETWORKS; GAME; GO;

DOI:

10.1109/TNNLS.2018.2805379

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Robotic control in a continuous action space has long been a challenging topic. This is especially true when controlling robots to solve compound tasks, as both basic skills and compound skills need to be learned. In this paper, we propose a hierarchical deep reinforcement learning algorithm to learn basic skills and compound skills simultaneously. In the proposed algorithm, compound skills and basic skills are learned by two levels of hierarchy. In the first level of hierarchy, each basic skill is handled by its own actor, overseen by a shared basic critic. Then, in the second level of hierarchy, compound skills are learned by a meta critic by reusing basic skills. The proposed algorithm was evaluated on a Pioneer 3AT robot in three different navigation scenarios with fully observable tasks. The simulations were built in Gazebo 2 in a Robot Operating System (ROS) Indigo environment. The results show that the proposed algorithm can learn both high-performance basic skills and compound skills through the same learning process. The compound skills learned outperform those learned by a discrete action space deep reinforcement learning algorithm.
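
The two-level structure described in the abstract can be pictured with a minimal Python sketch. It shows only the forward pass: one actor per basic skill, one critic shared across those actors, and a meta critic that picks which basic skill to reuse. The abstract does not say how the meta critic reuses basic skills, so the argmax selection rule, the linear models, the class and function names, and the dimensions below are all illustrative assumptions, not the authors' implementation. No training step is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for a trained network)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Actor:
    """Hypothetical actor for one basic skill: state -> continuous action."""

    def __init__(self, state_dim, action_dim, rng):
        self.weights = make_matrix(action_dim, state_dim, rng)

    def act(self, state):
        # tanh keeps every action component inside [-1, 1]
        return [math.tanh(v) for v in matvec(self.weights, state)]


class SharedBasicCritic:
    """Hypothetical critic shared by all basic-skill actors."""

    def __init__(self, state_dim, action_dim, n_skills, rng):
        self.n_skills = n_skills
        size = state_dim + action_dim + n_skills
        self.weights = [rng.gauss(0.0, 0.1) for _ in range(size)]

    def q_value(self, state, action, skill):
        # A one-hot skill index lets a single critic score every actor
        one_hot = [1.0 if i == skill else 0.0 for i in range(self.n_skills)]
        features = list(state) + list(action) + one_hot
        return sum(w * x for w, x in zip(self.weights, features))


class MetaCritic:
    """Hypothetical meta critic: scores each basic skill for a state."""

    def __init__(self, state_dim, n_skills, rng):
        self.weights = make_matrix(n_skills, state_dim, rng)

    def skill_values(self, state):
        return matvec(self.weights, state)

    def select_skill(self, state):
        # Assumed rule: reuse the basic skill with the highest score
        values = self.skill_values(state)
        return max(range(len(values)), key=values.__getitem__)


def compound_action(state, actors, meta_critic):
    """Second level picks a skill; first level produces the action."""
    skill = meta_critic.select_skill(state)
    return skill, actors[skill].act(state)


rng = random.Random(0)
STATE_DIM, ACTION_DIM, N_SKILLS = 4, 2, 3  # illustrative sizes only
actors = [Actor(STATE_DIM, ACTION_DIM, rng) for _ in range(N_SKILLS)]
basic_critic = SharedBasicCritic(STATE_DIM, ACTION_DIM, N_SKILLS, rng)
meta_critic = MetaCritic(STATE_DIM, N_SKILLS, rng)

state = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]
skill, action = compound_action(state, actors, meta_critic)
q_value = basic_critic.q_value(state, action, skill)
```

In this sketch the shared critic takes the skill index as an extra input, which is one simple way a single critic could oversee several actors; the paper itself may use a different arrangement.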


Pages: 5174-5184

Page count: 11

