35 entries in total
- [1] Ajay A, 2022, Arxiv, DOI arXiv:2211.15657
- [2] Anschel O, 2017, PR MACH LEARN RES, V70
- [3] Baisero A, 2021, gym-gridverse: Gridworld domains for fully and partially observable reinforcement learning
- [4] Chebotar Y, 2023, C ROBOT LEARNING, P3909
- [6] Chen LL, 2021, ADV NEUR IN, V34
- [7] Esslinger K, 2022, Arxiv, DOI arXiv:2206.01078
- [8] Fortunato M, 2019, Arxiv, DOI arXiv:1706.10295
- [9] Fujimoto S, 2021, ADV NEUR IN, V34
- [10] Fujimoto S, 2018, PR MACH LEARN RES, V80