29 entries in total
[1]
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (04): 943-949
[2]
[Anonymous], 2008, REINFORCEMENT LEARNI
[3]
[Anonymous], P INT C NEUR NETW
[4]
[Anonymous], 1962, MATH THEORY OPTIMAL
[5]
Baird L., 1995, MACHINE LEARNING P, P30, DOI 10.1016/B978-1-55860-377-6.50013-X
[6]
TEMPORAL DIFFERENCE-METHODS AND MARKOV-MODELS [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1993, 23 (02): 357-365
[7]
NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1983, 13 (05): 834-846
[8]
Bellman R. E., 1957, Dynamic programming. Princeton landmarks in mathematics
[10]
Fairbank M., 2011, LOCAL OPTIMALITY REI