32 entries in total
[1]
[Anonymous], 2018, INT C MACH LEARN
[2]
[Anonymous], 2016, OpenAI Gym
[3]
Basar T., 1998, Dynamic Noncooperative Game Theory
[4]
Borkar V.S., 2009, Stochastic approximation: a dynamical systems viewpoint, V48
[5]
Chan S. C., 2019, INT C LEARN REPR
[6]
Fiez T., 2020, PR MACH LEARN RES, P3133
[7]
Flokas L., 2021, ARXIV210105248
[8]
Foerster J., 2018, PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS '18), P122
[9]
A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (06): 1291-1307
[10]
Hernandez-Leal P., 2017, AUTONOMOUS AGENTS MU