50 items in total
- [41] A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 19, 2024: 21510-21517
- [42] Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
- [43] Reinforcement learning for MDPs using temporal difference schemes. PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 1997: 577-583
- [45] Exploiting Additive Structure in Factored MDPs for Reinforcement Learning. RECENT ADVANCES IN REINFORCEMENT LEARNING, 2008, 5323: 15-26
- [50] Active Coverage for PAC Reinforcement Learning. THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195