共 27 条
- [1] Araabi BN(2007)A study on expertise of agents and its effects on cooperative Q-Learning IEEE Trans Syst Man Cybern, Part B, Cybern 37 398-409
- [2] Mastoureshgh S(1992)Q-Learning Mach Learn 8 279-292
- [3] Ahmadabadi MN(2000)Quad-Q-Learning IEEE Trans Neural Netw 11 279-294
- [4] Watkins C(2008)Improved adaptive–reinforcement learning control for morphing unmanned air vehicles IEEE Trans Syst Man Cybern, Part B, Cybern 38 1014-1020
- [5] Dayan P(2008)Estimating biped gait using spline-based probability distribution function with Q-Learning IEEE Trans Ind Electron 55 1444-1452
- [6] Clausen C(2004)A new Q-Learning algorithm based on the metropolis criterion IEEE Trans Syst Man Cybern, Part B, Cybern 34 2140-2143
- [7] Wechsler H(2008)Ensemble algorithms in reinforcement learning IEEE Trans Syst Man Cybern, Part B, Cybern 38 930-935
- [8] Valasek J(1992)The convergence of TD( Mach Learn 8 341-362
- [9] Doebbler J(1996)) for general Mach Learn 22 251-281
- [10] Tandale MD(2005)Creating advice-taking reinforcement learners IEEE Trans Intell Transp Syst 6 285-293