13 entries in total
[1]
Azuma K., 1967, TOHOKU MATH J, V19, P357, DOI 10.2748/TMJ/1178243286
[2]
Beleznay F., 1999, TR9902 MINDM LTD
[4]
BERTSEKAS DP, 1996, NEURODYNAMIC PROGRAM
[5]
JAAKKOLA T, 1994, NEURAL COMPUTATION, V6
[6]
Kearns M, 1999, ADV NEUR IN, V11, P996
[7]
LITTMAN ML, 1996, P 13 INT C MACH LEAR, P310
[8]
Puterman M.L., 2008, Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics
[9]
Sutton R. S., 1998, Reinforcement Learning: An Introduction
[10]
Szepesvari C, 1998, ADV NEUR IN, V10, P1064