163 entries in total
[32]
Francois-Lavet V., 2016, P EUR WORKSH REINF L
[33]
An Introduction to Deep Reinforcement Learning [J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2018, 11(3-4): 219-354
[34]
Fujimoto S., 2018, PROC INT C MACH LEAR
[35]
Gamage H., 2017, PROC IEEE 86 VEH TEC, P1
[36]
Gu S., 2016, Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
[37]
Haarnoja T, 2018, P INT C MACH LEARN
[38]
Hausknecht M., 2015, P AAAI C ART INT
[40]
Hochreiter S, 2001, A Field Guide to Dynamical Recurrent Networks, P237, DOI 10.1109/9780470544037.CH14