47 entries in total
[1]
[Anonymous], 2009, P 26 ANN INT C MACH
[2]
Baird L., 1995, In Proceedings of the Twelfth International Conference on Machine Learning, P30
[4]
Brandfonbrener D., 2020, INT C LEARN REPR
[5]
Cai Q., 2019, Advances in Neural Information Processing Systems, P11312
[6]
Dann C, 2014, J MACH LEARN RES, V15, P809
[7]
De Asis K., 2020, P 34 AAAI C ART INT, P9337
[8]
A Convergent Off-Policy Temporal Difference Algorithm [J]. ECAI 2020: 24th European Conference on Artificial Intelligence, 2020, 325: 1103-1110
[9]
Ghiassian S., 2020, INT C MACHINE LEARNI, P3524
[10]
Ghiassian S., 2018, ARXIV PREPRINT ARXIV