35 entries in total
- [1] Abels Axel, 2019, INT C MACHINE LEARNI, V97, P11
- [3] Achiam J, 2017, PR MACH LEARN RES, V70
- [4] Altman E., 1999, Constrained Markov Decision Processes, V7
- [5] [Anonymous], 2018, ARXIV180511074
- [6] [Anonymous], 1994, Adaptive Control
- [7] Bai Q, 2021, ARXIV200305555
- [8] Bertsekas DP, 2005, DYNAMIC PROGRAMMING
- [10] Brantley K., 2020, ARXIV200605051