36 entries in total
- [1] Albus J. S., 1975, Transactions of the ASME. Series G, Journal of Dynamic Systems, Measurement and Control, V97, P220, DOI 10.1115/1.3426922
- [2] Baird L. C., 1993, WLTR931147 WRIGHT PA
- [3] Bertsekas D. P., 1995, A Counterexample to Temporal Differences Learning, Neural Computation, V7(2), P270-279
- [4] Billingsley P., 1999, Convergence of Probability Measures, 2nd ed., DOI 10.1002/9780470316962
- [5] Ernst D, 2005, J MACH LEARN RES, V6, P503
- [6] Fairbank M., 2012, P IEEE INT JOINT C N
- [7] Gaskett C, 1999, LECT NOTES ARTIF INT, V1747, P417
- [8] Gordon G., 1996, CHATTERING SARSA LAM
- [9] Gyorfi L., 1985, WILEY SERIES PROBABI
- [10] Hansen B. E., 2008, ECONOMETRIC THEORY