50 entries in total
- [1] Near-optimal Reinforcement Learning in Factored MDPs ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
- [2] Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2024, 235
- [3] Near-Optimal Interdiction of Factored MDPs CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2017), 2017
- [4] Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [5] Dynamic Regret of Adversarial Linear Mixture MDPs ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [7] Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [10] Near-Optimal Sample Complexity Bounds for Constrained MDPs ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022