Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Times Cited: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2023
Keywords
COLLISION-AVOIDANCE;
DOI
10.1109/IROS55552.2023.10341948
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Mobile robot navigation in human-populated environments, known as crowd navigation, has attracted great interest from the research community in recent years. Recently, offline reinforcement learning (RL)-based methods have been introduced to this domain for two reasons: they alleviate the sim2real gap incurred by online RL, which relies on simulators for training, and they scale well, since the same dataset can be reused to train policies for differently customized rewards. However, the performance of the navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: given an input outside the training data distribution, the learned policy risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To realize risk sensitivity and improve the safety of the offline RL agent during deployment, this work proposes a multipolicy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and serves as a risk detector, switching control to the risk-avoiding policy whenever the robot tends to leave the region supported by the training data. Experimental results showed that the proposed method was able to learn navigation in a crowded scene from an offline trajectory dataset, and that the risk detector substantially reduced the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
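The switching scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`select_action`, `toy_density`) and the density threshold are assumptions, and the toy Gaussian stands in for the learned Lyapunov density model.

```python
import math

def select_action(obs_latent, rl_action, avoid_action, density_fn, threshold=0.1):
    """Multipolicy switch: use the offline-RL action when the latent
    observation lies in a region well supported by the training data
    (high learned density); otherwise fall back to the force-based
    risk-avoiding action. Names and threshold are illustrative."""
    support_score = density_fn(obs_latent)  # learned density estimate at this latent state
    if support_score >= threshold:
        return rl_action      # in-distribution: trust the offline RL policy
    return avoid_action       # out-of-distribution: switch to risk-avoiding policy

# Toy stand-in for a learned density model: a Gaussian bump at the origin,
# so latent states near the origin count as "supported by the training data".
def toy_density(z):
    return math.exp(-sum(x * x for x in z))

in_dist = select_action([0.0, 0.0], "rl", "avoid", toy_density)    # near training data
out_dist = select_action([5.0, 5.0], "rl", "avoid", toy_density)   # far from training data
```

In this toy setup the agent near the origin keeps the offline RL action, while the far-away state triggers the fallback, mirroring the paper's detector-driven handover.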
Pages: 7456-7462
Page count: 7
Related Papers
47 total
  • [21] Mapless navigation via Hierarchical Reinforcement Learning with memory-decaying novelty
    Gao, Yan
    Lin, Feiqiang
    Cai, Boliang
    Wu, Jing
    Wei, Changyun
    Grech, Raphael
    Ji, Ze
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2024, 182
  • [22] Multi-Agent Deep Reinforcement Learning for UAVs Navigation in Unknown Complex Environment
    Xue, Yuntao
    Chen, Weisheng
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 2290 - 2303
  • [23] Memory-based crowd-aware robot navigation using deep reinforcement learning
    Samsani, Sunil Srivatsav
    Mutahira, Husna
    Muhammad, Mannan Saeed
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (02) : 2147 - 2158
  • [24] Vision-Based Autonomous Navigation Approach for a Tracked Robot Using Deep Reinforcement Learning
    Ejaz, Muhammad Mudassir
    Tang, Tong Boon
    Lu, Cheng-Kai
    IEEE SENSORS JOURNAL, 2021, 21 (02) : 2230 - 2240
  • [25] Real-time navigation of mecanum wheel-based mobile robot in a dynamic environment
    Shafiq, Muhammad Umair
    Imran, Abid
    Maznoor, Sajjad
    Majeed, Afraz Hussain
    Ahmed, Bilal
    Khan, Ilyas
    Mohamed, Abdullah
    HELIYON, 2024, 10 (05)
  • [27] A Deep Reinforcement Learning Method for Mobile Robot Collision Avoidance based on Double DQN
    Xue, Xidi
    Li, Zhan
    Zhang, Dongsheng
    Yan, Yingxin
    2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019, : 2131 - 2136
  • [28] Deep Reinforcement Learning for Vision-Based Navigation of UAVs in Avoiding Stationary and Mobile Obstacles
    Kalidas, Amudhini P.
    Joshua, Christy Jackson
    Md, Abdul Quadir
    Basheer, Shakila
    Mohan, Senthilkumar
    Sakri, Sapiah
    DRONES, 2023, 7 (04)
  • [29] Goal-Conditioned Reinforcement Learning within a Human-Robot Disassembly Environment
    Elguea-Aguinaco, Inigo
    Serrano-Munoz, Antonio
    Chrysostomou, Dimitrios
    Inziarte-Hidalgo, Ibai
    Bogh, Simon
    Arana-Arexolaleiba, Nestor
    APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [30] Deep Reinforcement Learning based Robot Navigation in Dynamic Environments using Occupancy Values of Motion Primitives
    Akmandor, Neset Unver
    Li, Hongyu
    Lvov, Gary
    Dusel, Eric
    Padir, Taskin
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 11687 - 11694