48 entries in total
- [1] Sutton R.S., Barto A.G., Reinforcement Learning: An Introduction, (1998)
- [2] Kober J., Bagnell J.A., Peters J., Reinforcement learning in robotics: a survey, Int. J. Rob. Res., 32, 11, pp. 1238-1274, (2013)
- [3] Mnih V., Kavukcuoglu K., Silver D., Rusu A.A., Veness J., Bellemare M.G., Graves A., Riedmiller M., Fidjeland A.K., Ostrovski G., et al., Human-level control through deep reinforcement learning, Nature, 518, 7540, pp. 529-533, (2015)
- [4] Silver D., Huang A., Maddison C.J., Guez A., Sifre L., Van Den Driessche G., Schrittwieser J., Antonoglou I., Panneershelvam V., Lanctot M., et al., Mastering the game of Go with deep neural networks and tree search, Nature, 529, 7587, pp. 484-489, (2016)
- [5] Bahdanau D., Brakel P., Xu K., Goyal A., Lowe R., Pineau J., Courville A., Bengio Y., An actor-critic algorithm for sequence prediction, Proceedings of the International Conference on Learning Representations, (2016)
- [6] Watkins C.J.C.H., Learning from delayed rewards, (1989)
- [7] Jaakkola T., Jordan M.I., Singh S.P., Convergence of stochastic iterative dynamic programming algorithms, Advances in Neural Information Processing Systems, pp. 703-710, (1994)
- [8] Li B., Yang Q., Xue X., Transfer learning for collaborative filtering via a rating-matrix generative model, Proceedings of the 26th Annual International Conference on Machine Learning, pp. 617-624, (2009)
- [9] Pan S.J., Yang Q., A survey on transfer learning, IEEE Trans. Knowl. Data Eng., 22, 10, pp. 1345-1359, (2010)
- [10] Oquab M., Bottou L., Laptev I., Sivic J., Learning and transferring mid-level image representations using convolutional neural networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1717-1724, (2014)