87 entries in total
- [1] Boutsidis C (2014) Near-optimal column-based matrix reconstruction. SIAM J. Comput. 43, 687-717
- [2] Bertsekas DP (1991) An analysis of stochastic shortest path problems. Math. OR 16, 580-595
- [3] Bertsekas DP (2009) Projected equation methods for approximate solution of large linear systems. J. Comput. Appl. Math. 227, 27-50
- [4] Bertsekas DP (2012) Q-learning and enhanced policy iteration in discounted dynamic programming. Math. OR 37, 66-94
- [5] Bertsekas DP (1975) On the method of multipliers for convex programming. IEEE Trans. Autom. Control 20, 385-388
- [6] Bertsekas DP (2011) Temporal difference methods for general projected equations. IEEE Trans. Autom. Control 56, 2128-2139
- [7] Bertsekas DP (2011) Approximate policy iteration: a survey and some new methods. J. Control Theory Appl. 9, 310-335
- [8] Boyan JA (2002) Technical update: least-squares temporal difference learning. Mach. Learn. 49, 1-15
- [9] Bradtke SJ (1996) Linear least-squares algorithms for temporal difference learning. Mach. Learn. 22, 33-57
- [10] Censor Y (2009) A note on the behavior of the randomized Kaczmarz algorithm of Strohmer and Vershynin. J. Fourier Anal. Appl. 15, 431-436