38 items in total
[1]
[Anonymous], 1993, P ADV NEUR INF PROC
[2]
[Anonymous], WLTR931146 WRIGHT PA
[3]
Asada M, 1996, IROS 96 - PROCEEDINGS OF THE 1996 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS - ROBOTIC INTELLIGENCE INTERACTING WITH DYNAMIC WORLDS, VOLS 1-3, P1502, DOI 10.1109/IROS.1996.569012
[4]
Atkeson C. G., 1994, Advances in neural information processing systems, P663
[5]
Baird L., 1995, MACHINE LEARNING
[6]
Barto AG, Sutton RS, Anderson CW, 1983, NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS, IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, V13, N5, P834-846
[7]
Bertsekas DP, 2012, DYNAMIC PROGRAMMING, V2
[8]
Bradtke S. J., 1995, Advances in Neural Information Processing Systems 7, P393
[9]
Watkins C. J. C. H., 1989, Learning from delayed rewards
[10]
Crites RH, 1996, ADV NEUR IN, V8, P1017