Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Times Cited: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2023
Keywords
COLLISION-AVOIDANCE;
DOI
10.1109/IROS55552.2023.10341948
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Mobile robot navigation in human-populated environments, known as crowd navigation, has attracted great interest from the research community in recent years. Recently, offline reinforcement learning (RL)-based methods have been introduced to this domain for two reasons: they alleviate the sim2real gap incurred by online RL, which relies on simulators for training, and they scale well, since the same dataset can be reused to train policies for differently customized rewards. However, the performance of the navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: given an input outside the training data distribution, the learned policy risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To realize risk sensitivity and improve the safety of the offline RL agent during deployment, this work proposes a multipolicy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and serves as a risk detector, switching control to the risk-avoiding policy whenever the robot tends to leave the region supported by the training data. Experimental results showed that the proposed method was able to learn navigation in a crowded scene from an offline trajectory dataset, and that the risk detector substantially reduced the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
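The switching scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`select_action`, `toy_density`) and the density threshold are assumptions, and the toy Gaussian stands in for the learned Lyapunov density model.

```python
import math

def select_action(obs_latent, rl_action, avoid_action, density_fn, threshold=0.1):
    """Multipolicy switch: use the offline-RL action when the latent
    observation lies in a region well supported by the training data
    (high learned density); otherwise fall back to the force-based
    risk-avoiding action. Names and threshold are illustrative."""
    support_score = density_fn(obs_latent)  # learned density estimate at this latent state
    if support_score >= threshold:
        return rl_action      # in-distribution: trust the offline RL policy
    return avoid_action       # out-of-distribution: switch to risk-avoiding policy

# Toy stand-in for a learned density model: a Gaussian bump at the origin,
# so latent states near the origin count as "supported by the training data".
def toy_density(z):
    return math.exp(-sum(x * x for x in z))

in_dist = select_action([0.0, 0.0], "rl", "avoid", toy_density)    # near training data
out_dist = select_action([5.0, 5.0], "rl", "avoid", toy_density)   # far from training data
```

In this toy setup the agent near the origin keeps the offline RL action, while the far-away state triggers the fallback, mirroring the paper's detector-driven handover.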
Pages: 7456-7462
Page count: 7
Related Papers
47 total
  • [21] Mapless navigation via Hierarchical Reinforcement Learning with memory-decaying novelty
    Gao, Yan
    Lin, Feiqiang
    Cai, Boliang
    Wu, Jing
    Wei, Changyun
    Grech, Raphael
    Ji, Ze
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2024, 182
  • [22] Multi-Agent Deep Reinforcement Learning for UAVs Navigation in Unknown Complex Environment
    Xue, Yuntao
    Chen, Weisheng
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 2290 - 2303
  • [23] Memory-based crowd-aware robot navigation using deep reinforcement learning
    Samsani, Sunil Srivatsav
    Mutahira, Husna
    Muhammad, Mannan Saeed
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (02) : 2147 - 2158
  • [24] Vision-Based Autonomous Navigation Approach for a Tracked Robot Using Deep Reinforcement Learning
    Ejaz, Muhammad Mudassir
    Tang, Tong Boon
    Lu, Cheng-Kai
    IEEE SENSORS JOURNAL, 2021, 21 (02) : 2230 - 2240
  • [25] Real-time navigation of mecanum wheel-based mobile robot in a dynamic environment
    Shafiq, Muhammad Umair
    Imran, Abid
    Maznoor, Sajjad
    Majeed, Afraz Hussain
    Ahmed, Bilal
    Khan, Ilyas
    Mohamed, Abdullah
    HELIYON, 2024, 10 (05)
  • [27] A Deep Reinforcement Learning Method for Mobile Robot Collision Avoidance based on Double DQN
    Xue, Xidi
    Li, Zhan
    Zhang, Dongsheng
    Yan, Yingxin
    2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019, : 2131 - 2136
  • [28] Deep Reinforcement Learning for Vision-Based Navigation of UAVs in Avoiding Stationary and Mobile Obstacles
    Kalidas, Amudhini P.
    Joshua, Christy Jackson
    Md, Abdul Quadir
    Basheer, Shakila
    Mohan, Senthilkumar
    Sakri, Sapiah
    DRONES, 2023, 7 (04)
  • [29] Goal-Conditioned Reinforcement Learning within a Human-Robot Disassembly Environment
    Elguea-Aguinaco, Inigo
    Serrano-Munoz, Antonio
    Chrysostomou, Dimitrios
    Inziarte-Hidalgo, Ibai
    Bogh, Simon
    Arana-Arexolaleiba, Nestor
    APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [30] Deep Reinforcement Learning based Robot Navigation in Dynamic Environments using Occupancy Values of Motion Primitives
    Akmandor, Neset Unver
    Li, Hongyu
    Lvov, Gary
    Dusel, Eric
    Padir, Taskin
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 11687 - 11694