23 entries in total
[1]
[Anonymous], 2016, International Conference on Machine Learning, DOI 10.48550/ARXIV.1602.01783
[2]
Beattie Charles, 2016, Deepmind lab
[3]
Bruce J., 2017, One-shot reinforcement learning for robot navigation with interactive replay
[4]
Graves A., 2012, STUD COMPUT INTELL, V385, P1, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]
[5]
A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42(6): 1291-1307
[6]
Gurvits L., 1994, PREPRINT
[7]
He K., 2016, CVPR, DOI 10.1109/CVPR.2016.90
[8]
Herrasti Alvaro, 2017, AI2-THOR: An Interactive 3D Environment for Visual AI
[9]
Jaderberg M., 2016, REINFORCEMENT LEARNI
[10]
Lange S., 2010, 2010 INT JOINT C NEU, P1, DOI 10.1109/IJCNN.2010.5596468