31 entries in total
[1]
Alegre L. N., 2019, Sumo-rl
[2]
Alegre L. N., 2021, P 20 INT C AUT AG MU, P97
[3]
[Anonymous], 2018, DEEP REINFORCEMENT L
[6]
Bernstein D. S., 2000, P 16 C UNCERTAINTY A, P32
[8]
A comprehensive survey of multiagent reinforcement learning [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2008, 38(2): 156-172
[9]
Choi SPM, 2000, P 12 ADV NEUR INF PR, P994
[10]
Christopher John Cornish Hellaby Watkins, 1989, Learning from delayed rewards