50 entries in total
- [1] Temporal-difference emphasis learning with regularized correction for off-policy evaluation and control. Applied Intelligence, 2023, 53: 20917-20937
- [3] Generalized gradient emphasis learning for off-policy evaluation and control with function approximation. Neural Computing and Applications, 2023, 35: 23599-23616
- [7] Off-Policy Temporal Difference Learning for Perturbed Markov Decision Processes. IEEE Control Systems Letters, 2024, 8: 3488-3493
- [8] Distributed Gradient Temporal Difference Off-policy Learning With Eligibility Traces: Weak Convergence. IFAC-PapersOnLine, 2020, 53(02): 1563-1568
- [10] Off-Policy Reinforcement Learning with Loss Function Weighted by Temporal Difference Error. Advanced Intelligent Computing Technology and Applications, ICIC 2023, Pt V, 2023, 14090: 600-613