Autonomous navigation of mobile robots in unknown environments using off-policy reinforcement learning with curriculum learning

Citations: 10

Authors:

Yin, Yan [1]

Chen, Zhiyu [1,2]

Liu, Gang [1,2]

Yin, Jiasong [1]

Guo, Jianwei [1,2]

Affiliations:

[1] Changchun Univ Technol, Sch Comp Sci & Engn, Changchun 130012, Peoples R China

[2] Jilin Prov Data Serv Ind Publ Technol Res Ctr, Changchun, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Volume 247

Keywords:

Soft actor critic (SAC); CEP; Trajectory energy; Curriculum learning; Fuzzy logic control; Sampling efficiency; VISUAL NAVIGATION;

DOI:

10.1016/j.eswa.2024.123202

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning (RL) is effective for autonomous navigation tasks without prior knowledge of the environment. However, traditional mobile robot navigation algorithms, based on off-policy RL, often face challenges such as low sample efficiency during training and lack of adequate safety mechanisms. In this paper, we present an off-policy RL navigation model named Soft Actor-Critic with Curriculum Prioritization and Fuzzy Logic (SCF). The model uses energy as a prioritized evaluation metric for experience replay. And through task-level curriculum, the agent's learning sequence is formulated, thereby enhancing sampling efficiency and safety. We propose a Curriculum-based Energy Prioritization (CEP) approach. It selects a replay trajectory that matches the current agent's capability based on trajectory energy. Our results show that robots using off-policy RL often have limitations in dynamic obstacle avoidance. To rectify this, our model uses a fuzzy logic controller to enhance real-time obstacle avoidance. The SCF approach enables mobile robots to navigate adeptly in unpredictable and dynamic environments, ensuring optimal planning control while being safe and robust. Experiments in Gazebo simulation environment and real world confirm the effectiveness of our proposed method. The comparison results show the superior performance of this method, especially in unknown and dynamic environments.
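
The abstract describes choosing replay trajectories whose energy matches the agent's current capability. The sketch below illustrates that general idea only; it is not the paper's CEP formulation, which this record does not give. Everything in it is an assumption made for illustration: the kinetic-energy definition of trajectory energy, the Gaussian matching weight, the scalar `capability` and `width` parameters, and all function names.

```python
import math
import random


def trajectory_energy(speeds, mass=1.0):
    """Assumed energy measure: summed kinetic energy over the trajectory's steps."""
    return sum(0.5 * mass * v * v for v in speeds)


def curriculum_weights(energies, capability, width=1.0):
    """Assumed matching rule: Gaussian weight that peaks where energy == capability.

    A small floor keeps every weight positive so sampling never fails.
    """
    return [
        max(math.exp(-((e - capability) ** 2) / (2.0 * width ** 2)), 1e-12)
        for e in energies
    ]


def sample_trajectory(trajectories, capability, width=1.0, rng=random):
    """Return the index of one stored trajectory, sampled by curriculum weight."""
    energies = [trajectory_energy(t) for t in trajectories]
    weights = curriculum_weights(energies, capability, width)
    return rng.choices(range(len(trajectories)), weights=weights, k=1)[0]
```

In this sketch, raising `capability` over the course of training shifts sampling toward higher-energy trajectories, which is one simple way to realize an easy-to-hard replay order.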

Pages: 18

50 items in total

[21] On the Reuse Bias in Off-Policy Reinforcement Learning
Ying, Chengyang
Hao, Zhongkai
Zhou, Xinning
Su, Hang
Yan, Dong
Zhu, Jun
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 4513 - 4521
[22] A perspective on off-policy evaluation in reinforcement learning
Lihong Li
Frontiers of Computer Science, 2019, 13 : 911 - 912
[23] Reliable Off-Policy Evaluation for Reinforcement Learning
Wang, Jie
Gao, Rui
Zha, Hongyuan
OPERATIONS RESEARCH, 2024, 72 (02) : 699 - 716
[24] Sequential Search with Off-Policy Reinforcement Learning
Miao, Dadong
Wang, Yanan
Tang, Guoyu
Liu, Lin
Xu, Sulong
Long, Bo
Xiao, Yun
Wu, Lingfei
Jiang, Yunjiang
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 4006 - 4015
[25] Representations for Stable Off-Policy Reinforcement Learning
Ghosh, Dibya
Bellemare, Marc G.
25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
[26] Off-Policy Differentiable Logic Reinforcement Learning
Zhang, Li
Li, Xin
Wang, Mingzhong
Tian, Andong
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT II, 2021, 12976 : 617 - 632
[27] Marginalized Operators for Off-policy Reinforcement Learning
Tang, Yunhao
Rowland, Mark
Munos, Remi
Valko, Michal
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 655 - 679
[28] Off-Policy Shaping Ensembles in Reinforcement Learning
Harutyunyan, Anna
Brys, Tim
Vrancx, Peter
Nowe, Ann
21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1021 - 1022
[29] Switching control of mobile robots for autonomous navigation in unknown environments
Toibero, J. A.
Carelli, R.
Kuchen, B.
PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-10, 2007, : 1974 - +
[30] Autonomous navigation of a mobile robot in dynamic indoor environments using SLAM and reinforcement learning
Chewu, C. C. E.
Kumar, V. Manoj
2ND INTERNATIONAL CONFERENCE ON ADVANCES IN MECHANICAL ENGINEERING (ICAME 2018), 2018, 402