Mapless Navigation with Deep Reinforcement Learning based on The Convolutional Proximal Policy Optimization Network

Cited by: 14
Authors
Toan, Nguyen Duc [1]
Woo, Kim Gon [2]
Affiliations
[1] Dept. of Control & Robot Engineering, Cheongju, Chungbuk, South Korea
[2] Dept. of Electronics Engineering, Cheongju, Chungbuk, South Korea
Source
2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2021) | 2021
Keywords
reinforcement learning; mapless navigation; proximal policy optimization; SYSTEM;
DOI
10.1109/BigComp51126.2021.00063
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
In recent years, mapless navigation with deep reinforcement learning for autonomous mobile robots has shown considerable benefit in improving the flexibility of robot behavior. Specifically, the robot adapts to complex constraints and performs well in various environments without a predetermined map or route plan. However, several previous studies show a lack of stability when training deep reinforcement learning networks for mapless navigation. Moreover, the balance between exploration and exploitation in a specific work environment plays an essential role in mapless navigation performance and needs to be carefully considered. Motivated by these issues, and inspired by the Proximal Policy Optimization algorithm, which not only evaluates the advantages and disadvantages of a policy but also prevents the policy from changing too much after each weight update, we propose a Convolutional Proximal Policy Optimization network for the mapless navigation problem. Furthermore, a Boltzmann policy is used to balance exploration and exploitation, helping the robot explore more deeply in complex environments and significantly improving mapless navigation performance.
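The two mechanisms the abstract highlights can be sketched briefly: PPO's clipped surrogate objective, which bounds how far the policy can move per weight update, and a Boltzmann (softmax) policy, which trades exploration against exploitation via a temperature. This is a minimal illustrative sketch, not the paper's implementation; the function names, array inputs, and default hyperparameters (e.g. `epsilon=0.2`) are assumptions for illustration.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective in the style of PPO.

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action.
    advantage: estimated advantage for each sampled action.
    Clipping the ratio to [1 - epsilon, 1 + epsilon] prevents the
    policy from changing too much after a single update.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # PPO maximizes the elementwise minimum; negate it to use as a loss.
    return -np.mean(np.minimum(unclipped, clipped))

def boltzmann_policy(preferences, temperature=1.0):
    """Softmax (Boltzmann) distribution over action preferences.

    A high temperature flattens the distribution (more exploration);
    a low temperature concentrates it on the best action (exploitation).
    """
    z = preferences / temperature
    z = z - z.max()  # subtract the max for numerical stability
    probs = np.exp(z)
    return probs / probs.sum()
```

Lowering `temperature` over the course of training is one common way to shift the robot from broad exploration of a complex environment toward exploiting the learned policy.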
Pages: 298-301
Number of pages: 4