49 entries in total
[1]
[Anonymous], 1994, P ADV NEUR INF PROC
[3]
Choi SPM, 1996, ADV NEUR IN, V8, P945
[6]
Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2005, 35(3): 360-372
[7]
Duan Y., 2016, ARXIV161102779
[8]
Finn C, 2017, PR MACH LEARN RES, V70