42 entries in total
[1]
Abate A, 2007, LECT NOTES COMPUT SC, V4416, P4
[3]
[Anonymous], 2007, DYNAMIC PROGRAMMING
[4]
[Anonymous], 2013, The Cross-Entropy Method
[5]
[Anonymous], 2010, Algorithms for Reinforcement Learning
[6]
Antos A., 2008, Advances in Neural Information Processing Systems, P9
[7]
Bagnell JA, 2001, IEEE INT CONF ROBOT, P1615, DOI 10.1109/ROBOT.2001.932842
[8]
Approximate policy iteration: A survey and some new methods [J]. Journal of Control Theory and Applications, 2011, 9(3): 310-335
[9]
BERTSEKAS D. P., 1996, Stochastic optimal control: the discrete-time case
[10]
Busoniu L, 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f