10 items in total
- [1] Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), VOL 130, 2021
- [2] Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [3] Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022
- [5] Q-Learning Lagrange Policies for Multi-Action Restless Bandits KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021: 871-881
- [7] A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 3868-3876
- [9] Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming 2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022: 4545-4552