50 items in total
- [11] Preference-based reinforcement learning: a formal framework and a policy iteration algorithm [J]. Machine Learning, 2012, 89: 123-156
- [12] Robust Control of An Inverted Pendulum System Based on Policy Iteration in Reinforcement Learning [J]. APPLIED SCIENCES-BASEL, 2023, 13(24)
- [17] Robotic Depalletizing via Reinforcement Learning of a Pushing Policy [J]. SUPPLY CHAINS, PT I, ICSC 2024, 2025, 2110: 105-117
- [18] A Reinforcement Learning Algorithm Based on Policy Iteration for Average Reward: Empirical Results with Yield Management and Convergence Analysis [J]. Machine Learning, 2004, 55: 5-29
- [19] Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning [J]. Performance Evaluation Review, 2023, 51(01): 83-84