36 references in total
[1]
Achiam J., 2021, Exploration and safety in deep reinforcement learning
[2]
Altman E., 1999, Constrained Markov Decision Processes, V7, DOI 10.1201/9781315140223
[3]
Carr S, 2023, AAAI CONF ARTIF INTE, P14748
[4]
Off-Policy Actor-critic for Recommender Systems, 2022, Proceedings of the 16th ACM Conference on Recommender Systems (RecSys 2022), P338-349
[5]
Cheng CA, 2022, PR MACH LEARN RES
[6]
Chow Y, 2018, J MACH LEARN RES, V18
[7]
Dai JT, 2023, AAAI CONF ARTIF INTE, P7288
[9]
Fujimoto S, 2021, ADV NEUR IN, V34
[10]
García J, 2015, J MACH LEARN RES, V16, P1437