7 entries in total
[3]
LAURENT G, 2002, P 2002 IEEE RSJ INT
[4]
LAURENT G, 2001, P 2001 IEEE RSJ INT
[5]
Puterman M.L., 2008, Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics
[6]
Sutton R. S., 1998, Reinforcement Learning: An Introduction
[7]
WATKINS CJCH, 1992, MACH LEARN, V8, P279, DOI 10.1007/BF00992698