20 entries in total
- [1] Approximate policy iteration: a survey and some new methods[J]. Dimitri P. Bertsekas. Journal of Control Theory and Applications. 2011(03)
- [2] Deep learning in neural networks: An overview[J]. Jürgen Schmidhuber. Neural Networks. 2014
- [4] A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(07): 1527-1554
- [5] Programming backgammon using self-teaching neural nets[J]. Gerald Tesauro. Artificial Intelligence. 2002(1)
- [6] Rollout algorithms for stochastic scheduling problems[J]. Journal of Heuristics, 1999, 5(01): 89-108
- [7] Rollout Algorithms for Combinatorial Optimization[J]. Dimitri P. Bertsekas, John N. Tsitsiklis, Cynara Wu. Journal of Heuristics. 1997(3)
- [8] Feature-Based Methods for Large Scale Dynamic Programming[J]. John N. Tsitsiklis, Benjamin Van Roy. Machine Learning. 1996(1)
- [9] A counterexample to temporal differences learning[J]. Neural Computation, 1995, 7(02): 270-279
- [10] Temporal difference learning and TD-Gammon[J]. Communications of the ACM, 1995, 38(03): 58-68