Initial State Interventions for Deconfounded Imitation Learning

Cited by: 0

Authors:

Pfrommer, Samuel [1]

Bai, Yatong [1]

Lee, Hyunin [1]

Sojoudi, Somayeh [1]

Affiliations:

[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA

Source:

2023 62nd IEEE Conference on Decision and Control (CDC) | 2023

Keywords:

DOI:

10.1109/CDC49753.2023.10383252

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Imitation learning suffers from causal confusion. This phenomenon occurs when learned policies attend to features that do not causally influence the expert actions but are instead spuriously correlated. Causally confused agents produce low open-loop supervised loss but poor closed-loop performance upon deployment. We consider the problem of masking observed confounders in a disentangled representation of the observation space. Our novel masking algorithm leverages the usual ability to intervene in the initial system state, avoiding any requirement involving expert querying, expert reward functions, or causal graph specification. Under certain assumptions, we theoretically prove that this algorithm is conservative in the sense that it does not incorrectly mask observations that causally influence the expert; furthermore, intervening on the initial state serves to strictly reduce excess conservatism. The masking algorithm is applied to behavior cloning for two illustrative control systems: CartPole and Reacher.


Pages: 2312-2319

Page count: 8
