25 references in total
[1]
Badia AP, 2020, Arxiv, DOI [arXiv:2002.06038, 10.48550/arXiv.2002.06038]
[2]
Baldassarre G, 2019, Arxiv, DOI arXiv:1912.13263
[4]
Bellemare MG, 2016, ADV NEUR IN, V29
[5]
Burda Y, 2018, Arxiv, DOI [arXiv:1810.12894, 10.48550/arXiv.1810.12894]
[6]
Burda Y, 2018, Arxiv, DOI arXiv:1808.04355
[7]
Stadie BC, 2015, Arxiv, DOI arXiv:1507.00814
[8]
Clark J, 2016, Faulty reward functions in the wild
[9]
Fu J, 2017, Adv. Neural Inf. Process. Syst., P30
[10]
Gidaris S, 2018, Arxiv, DOI arXiv:1803.07728