50 entries in total
- [22] Generalized Policy Iteration-based Reinforcement Learning Algorithm for Optimal Control of Unknown Discrete-time Systems [J]. PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021 : 3650 - 3655
- [24] A Survey of Preference-Based Online Learning with Bandit Algorithms [J]. ALGORITHMIC LEARNING THEORY (ALT 2014), 2014, 8776 : 18 - 39
- [28] A Policy-Based Reinforcement Learning Algorithm for Intelligent Train Control [J]. Tiedao Xuebao/Journal of the China Railway Society, 2020, 42 (01) : 69 - 75
- [30] Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 3, PROCEEDINGS, 2009, 5553 : 278 - 288