19 entries in total
- [1] Bertsekas D., 2012, DYNAMIC PROGRAMMING, V1
- [2] Bertsekas D., 1996, NEURO DYNAMIC PROGRA
- [3] Goyal R., 2023, arXiv, arXiv:2107.08086
- [5] Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 12819 - 12629
- [6] Kalashnikov D., 2018, arXiv:1806.10293, P651
- [7] Khadka Shauharda, 2019, International conference on machine learning, V97, P3341
- [8] Levine S, 2016, J MACH LEARN RES, V17
- [9] Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 2811 - 2817
- [10] Human-level control through deep reinforcement learning [J]. NATURE, 2015, 518 (7540) : 529 - 533