50 entries in total
- [2] Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints. International Conference on Machine Learning, vol. 162, 2022.
- [3] Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes. Thirty-Eighth AAAI Conference on Artificial Intelligence, vol. 38, no. 10, 2024, pp. 10980-10988.
- [7] Necessary Conditions for the Optimality Equation in Average-Reward Markov Decision Processes. Applied Mathematics and Optimization, vol. 19, no. 1, 1989, pp. 97-112.
- [10] Recursive Adaptive Control of Markov Decision Processes with the Average Reward Criterion. Applied Mathematics and Optimization, vol. 23, no. 2, 1991, pp. 193-207.