70 entries in total
[1]
Abel David, 2018, PMLR, P20
[2]
Experience Replay for Real-Time Reinforcement Learning Control [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (02): 201-212
[3]
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007: 38-+
[4]
Fernández, 2006, P 5 INT JOINT C AUTO, P720
[5]
Andrychowicz M., 2017, P ADV NEUR INF PROC
[6]
[Anonymous], 2013, Policy shaping: Integrating human feedback with reinforcement learning