Causal action empowerment for efficient reinforcement learning in embodied agents

Cited by: 0
Authors
Hongye Cao [1 ]
Fan Feng [2 ]
Jing Huo [1 ]
Yang Gao [1 ]
Affiliations
[1] National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing
[2] Department of Electrical Engineering, City University of Hong Kong
Funding
National Natural Science Foundation of China
Keywords
causal; embodied agents; empowerment; reinforcement learning; sample efficiency
DOI
10.1007/s11432-024-4396-3
Abstract
Reinforcement learning (RL) has been widely adopted for intelligent decision-making in embodied agents due to its effective trial-and-error learning capabilities. However, most RL methods overlook the causal relationships among states and actions during policy exploration and lack the human-like ability to distinguish signal from noise and reason with important abstractions, resulting in poor sample efficiency. To address this issue, we propose a novel method named causal action empowerment (CAE) for efficient RL, designed to improve sample efficiency in policy learning for embodied agents. CAE identifies and leverages causal relationships among states, actions, and rewards to extract controllable state variables and reweight actions for prioritizing high-impact behaviors. Moreover, by integrating a causality-aware empowerment term, CAE significantly enhances an embodied agent's execution of causally aware behavior for more efficient exploration by boosting controllability in complex embodied environments. Benefiting from these two improvements, CAE bridges the gap between local causal discovery and global causal empowerment. To comprehensively evaluate the effectiveness of CAE, we conduct extensive experiments across 25 tasks in 5 diverse embodied environments, encompassing both locomotion and manipulation skill learning with dense and sparse reward settings. Experimental results demonstrate that CAE consistently outperforms existing methods across this wide range of scenarios, offering a promising avenue for improving sample efficiency in RL. © Science China Press 2025.
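The combination the abstract describes — a task reward augmented by a causality-aware empowerment bonus over controllable state variables — can be illustrated with a minimal sketch. This is not the authors' implementation: the `causal_mask` array, the `beta` coefficient, and the variance-based empowerment proxy are all illustrative assumptions standing in for the learned causal structure and mutual-information estimate used in the paper.

```python
import numpy as np

def cae_reward(env_reward, next_states, causal_mask, beta=0.1):
    """Illustrative reward shaping in the spirit of CAE.

    env_reward  : float, the task reward from the environment
    next_states : (num_actions, state_dim) array of predicted next states,
                  one row per candidate action
    causal_mask : (state_dim,) 0/1 array marking which state variables a
                  causal-discovery step judged controllable (assumed given)
    beta        : weight of the empowerment bonus (illustrative choice)
    """
    # Keep only the state variables the causal mask marks as controllable.
    controllable = next_states[:, causal_mask.astype(bool)]
    # Crude empowerment proxy: how much the controllable variables spread
    # across candidate actions (more spread -> more control over outcomes).
    empowerment = float(controllable.var(axis=0).sum())
    return env_reward + beta * empowerment
```

A real implementation would learn the causal mask from interaction data and estimate empowerment as a mutual-information term between actions and controllable next states, rather than this variance proxy.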