37 entries in total
- [1] Watkins C.J.C.H., Dayan P., Q-learning, Mach. Learn., 8, 3-4, pp. 279-292, (1992)
- [2] Kokar M.M., Reveliotis S.A., Reinforcement learning: Architectures and algorithms, Int. J. Intell. Syst., 8, 8, pp. 875-894, (1993)
- [3] Martinez-Gil F., Lozano M., Fernandez F., MARL-Ped: A multi-agent reinforcement learning based framework to simulate pedestrian groups, Simul. Modelling Pract. Theory, 47, pp. 259-275, (2014)
- [4] Boubertakh H., Tadjine M., Glorennec P.-Y., A new mobile robot navigation method using fuzzy logic and a modified Q-learning algorithm, J. Intell. Fuzzy Syst., 21, 1-2, pp. 113-119, (2010)
- [5] Van Hasselt H., Reinforcement Learning in Continuous State and Action Spaces, in Reinforcement Learning, pp. 207-251, (2012)
- [6] Fernandez F., Borrajo D., Two steps reinforcement learning, Int. J. Intell. Syst., 23, 2, pp. 213-245, (2008)
- [7] Sutton R.S., Barto A.G., Reinforcement Learning: An Introduction, 1, (1998)
- [8] Da Motta Salles Barreto A., Anderson C.W., Restricted gradient-descent algorithm for value-function approximation in reinforcement learning, Artif. Intell., 172, 4, pp. 454-482, (2008)
- [9] Russell S.J., Norvig P., Canny J.F., Malik J.M., Edwards D.D., Artificial Intelligence: A Modern Approach, 2, (2003)
- [10] Parr R., Li L., Taylor G., Wakefield C.P., Littman M.L., An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, Proc. 25th Int. Conf. Machine Learning, ACM, pp. 752-759, (2008)