50 items in total
- [33] Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
- [34] Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
- [35] Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits 2019 IEEE 60TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2019), 2019: 126 - 146
- [36] Minimax Regret Bounds for Reinforcement Learning INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
- [37] Variational Regret Bounds for Reinforcement Learning 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 81 - 90
- [38] Regret of Queueing Bandits ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
- [40] Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 12, 2024: 13076 - 13084