38 entries in total
[1]
[Anonymous], 1993, MACHINE LEARNING MET
[2]
[Anonymous], 2010, P 6 INT WIRELESS COM, DOI 10.1145/1815396.1815448
[3]
Bellman R. E., 1957, Dynamic programming. Princeton landmarks in mathematics
[4]
A MINIMUM-TIME TRAJECTORY PLANNING METHOD FOR 2 ROBOTS [J]. IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, 1992, 8(03): 414-418
[5]
Busoniu L, 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f
[7]
Hybrid Q-learning Algorithm About Cooperation in MAS [J]. CCDC 2009: 21ST CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-6, PROCEEDINGS, 2009: 3943-3947
[8]
Chen Z, 2011, IEEE SOUTHEASTCON, P409, DOI 10.1109/SECON.2011.5752976
[9]
A production technique for a Q-table with an influence map for speeding up Q-learning [J]. 2007 INTERNATIONAL CONFERENCE ON INTELLIGENT PERVASIVE COMPUTING, PROCEEDINGS, 2007: 72-+
[10]
The knowledge gradient policy for offline learning with independent normal rewards [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007: 143-+