49 entries in total
[31]
Polyak Boris Teodorovich, 1963, USSR Computational Mathematics and Mathematical Physics, V3, P14
[32]
A Tour of Reinforcement Learning: The View from Continuous Control [J]. ANNUAL REVIEW OF CONTROL, ROBOTICS, AND AUTONOMOUS SYSTEMS, VOL 2, 2019, 2: 253-279
[33]
Schulman J, 2018, Arxiv, DOI arXiv:1506.02438
[34]
Schulman J, 2017, Arxiv, DOI [arXiv:1707.06347, 10.48550/arXiv.1707.06347]
[35]
Schulman J, 2015, PR MACH LEARN RES, V37, P1889
[36]
Shani L, 2019, Arxiv, DOI arXiv:1909.02769
[38]
Skogestad S., 2007, Multivariable Feedback Control: Analysis and Design, V2
[39]
Sutton RS, 2000, ADV NEUR IN, V12, P1057
[40]
Tu SP, 2019, Arxiv, DOI arXiv:1812.03565