21 entries in total
[2]
Bertsekas D. P., 1996, NEURO DYNAMIC PROGRA, DOI 10.1109/MCSE.1998.683749
[6]
Dietterich T.G., 1996, ACM Computing Surveys (CSUR), V28, P3, DOI 10.1145/242224.242229
[8]
Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2011, 41(01): 14-25
[9]
Ng A., 2004, P INT S EXPT ROBOTIC, P1