Causal deconfounding deep reinforcement learning for mobile robot motion planning

Cited by: 2

Authors:

Tang, Wenbing [1,2]

Wu, Fenghua [2]

Lin, Shang-wei [2]

Ding, Zuohua [3]

Liu, Jing [1]

Liu, Yang [2]

He, Jifeng [1]

Affiliations:

[1] East China Normal Univ, Shanghai Key Lab Trustworthy Comp, Shanghai 200062, Peoples R China

[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore

[3] Zhejiang Sci Tech Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2024 / Vol. 303

Keywords:

Backdoor paths; Causal inference; Deep reinforcement learning; Mobile robots; Motion planning; MODEL;

DOI:

10.1016/j.knosys.2024.112406

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep reinforcement learning (DRL) has emerged as an efficient approach for motion planning in mobile robot systems. It leverages the offline training process to enhance real-time computation efficiency. In DRL-based methods, the DRL models are trained to compute an action based on the current state of the robot and the surrounding obstacles. However, the trained models may capture spurious correlations through potential confounders, resulting in non-robust state representations, which limit the models' robustness and generalizability. In this paper, we propose a Causal Deconfounding DRL method for Motion Planning, CD-DRL-MP, to address spurious correlations and learn robust and generalizable policies. Specifically, we formalize the temporal causal relationships between states and actions using a structural causal model. We then extract the minimal sufficient state representation set by blocking the backdoor paths in the causal model. Finally, using the representation set, CD-DRL-MP learns the causal effect between states and actions while mitigating the detrimental influence of potential confounders and computes motion commands for mobile robots. Comprehensive experiments show that the proposed method significantly outperforms non-causal DRL methods and existing causal methods, while guaranteeing good robustness and generalizability.
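To illustrate what "blocking the backdoor paths" means in general, the following is a minimal sketch of the standard backdoor adjustment on a toy discrete model. It is not the CD-DRL-MP method: the variables `Z` (a confounder), `X` (a binary state feature), and `Y` (a binary action), along with every probability value, are hypothetical and chosen only for illustration. The sketch compares the naive observational difference P(Y=1 | X=1) - P(Y=1 | X=0) with the adjusted difference obtained from sum_z P(Y=1 | X=x, Z=z) P(Z=z).

```python
# Hypothetical toy model (not from the paper): Z confounds both X and Y.
P_Z = {0: 0.5, 1: 0.5}            # P(Z=z), assumed
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(X=1 | Z=z), assumed


def p_y1(x, z):
    """P(Y=1 | X=x, Z=z); the true effect of X on Y is 0.2 by construction."""
    return 0.1 + 0.2 * x + 0.6 * z


def joint_xz(x, z):
    """P(X=x, Z=z) under the assumed model."""
    p_x1 = P_X1_GIVEN_Z[z]
    return P_Z[z] * (p_x1 if x == 1 else 1.0 - p_x1)


def observational(x):
    """P(Y=1 | X=x): conditions on X, so Z's influence leaks in via P(Z | X)."""
    numerator = sum(p_y1(x, z) * joint_xz(x, z) for z in P_Z)
    denominator = sum(joint_xz(x, z) for z in P_Z)
    return numerator / denominator


def backdoor_adjusted(x):
    """P(Y=1 | do(X=x)): averages over the marginal P(Z), blocking X <- Z -> Y."""
    return sum(p_y1(x, z) * P_Z[z] for z in P_Z)


naive_effect = observational(1) - observational(0)
causal_effect = backdoor_adjusted(1) - backdoor_adjusted(0)
print(f"naive difference:    {naive_effect:.2f}")
print(f"adjusted difference: {causal_effect:.2f}")
```

In this toy model the naive difference overstates the effect of `X` because `Z` raises both `X` and `Y`; the adjusted difference recovers the 0.2 built into `p_y1`. A policy trained on the naive correlation would rely on a relationship that changes whenever the distribution of `Z` shifts, which is the kind of non-robust representation the abstract describes.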

Pages: 12

