55 entries in total
[1]
Agarwal Alekh, 2020, P MACHINE LEARNING R, V125
[2]
[Anonymous], 2011, Technical report
[3]
[Anonymous], 2018, ADV NEURAL INFORM PR
[5]
Error bounds for constant step-size Q-learning [J]. SYSTEMS & CONTROL LETTERS, 2012, 61 (12): 1203-1208
[7]
Bertsekas D. P., 2017, DYNAMIC PROGRAMMING, V4th
[8]
Bhandari J., 2018, C LEARN THEOR COLT, P1691
[10]
Cai Z., 2019, Advances in Neural Information Processing Systems, V32, P11312