50 entries in total
- [41] Off-policy and on-policy reinforcement learning with the Tsetlin machine. Applied Intelligence, 2023, 53: 8596-8613
- [43] Debiased Off-Policy Evaluation for Recommendation Systems. 15th ACM Conference on Recommender Systems (RecSys 2021), 2021: 372-379
- [44] Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control. Learning for Dynamics and Control Conference, Vol. 211, 2023, 211
- [46] On the asymptotic behavior of a constant stepsize temporal-difference learning algorithm. Computational Learning Theory, 1999, 1572: 126-137
- [47] Implementing Temporal-Difference Learning with the Scaled Conjugate Gradient Algorithm. Neural Processing Letters, 2005, 22: 361-375
- [49] Temporal-Difference Learning: An Online Support Vector Regression Approach. ICINCO 2015: Proceedings of the 12th International Conference on Informatics in Control, Automation and Robotics, Vol. 1, 2015: 318-323