32 entries in total
- [1] BELLMAN R, 1957, J MATH MECH, P6
- [2] Technical update: Least-squares temporal difference learning [J]. MACHINE LEARNING, 2002, 49 (2-3) : 233 - 246
- [3] The convergence of TD(λ) for general λ [J]. MACHINE LEARNING, 1992, 8 (3-4) : 341 - 362
- [4] DAYAN P, 1994, MACH LEARN, V14, P295
- [5] Gabel T., 2006, KI Z, V20, P18
- [6] Howard R. A., 1960, Dynamic programming and Markov processes
- [7] JOHN NT, 1994, MACH LEARN, V16, P185
- [8] Kleiner A., 2002, P INT ROB S 02 FUK J, P119
- [9] LENG J, 2006, LECT NOTES ARTIF INT, V4692, P572
- [10] Leng JS, 2006, LECT NOTES ARTIF INT, V4252, P472