113 entries in total
[11]
Bengio Y., 2006, P ADV NEUR INF PROC, V19, P153
[13]
Approximate policy iteration: A survey and some new methods [J]. Journal of Control Theory and Applications, 2011, 9(3): 310-335
[14]
Bertsekas D. P., 2012, LIDSP2884 LAB INF DE
[15]
Bertsekas D. P., 1996, NEURO DYNAMIC PROGRA
[16]
Bohmer W, 2013, J MACH LEARN RES, V14, P2067
[17]
Borkar V. S., 2008, STOCHASTIC APPROXIMA
[19]
Bradtke SJ, 1996, MACH LEARN, V22, P33, DOI 10.1007/BF00114723
[20]
Busoniu L, 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f