共 30 条
[1]
Baranes A(2013)Active learning of inverse models with intrinsically motivated goal exploration in robots Robotics and Autonomous Systems 61 49-73
[2]
Oudeyer PY(2011)Abandoning objectives: Evolution through the search for novelty alone Evolutionary Computation 19 189-223
[3]
Lehman J(2016)End-to-end training of deep visuomotor policies The Journal of Machine Learning Research 17 1334-1373
[4]
Stanley KO(2015)Human-level control through deep reinforcement learning Nature 518 529-287
[5]
Levine S(1999)Policy invariance under reward transformations: Theory and application to reward shaping Proceedings of the International Conference on Machine Learning 99 278-2819
[6]
Finn C(2014)Changing the environment based on empowerment as intrinsic motivation Entropy 16 2789-1897
[7]
Darrell T(2015)Trust region policy optimization Proceedings of the International Conference on International Conference on Machine Learning 37 1889-44
[8]
Abbeel P(1988)Learning to predict by the methods of temporal differences Machine Learning 3 9-undefined
[9]
Mnih V(undefined)undefined undefined undefined undefined-undefined
[10]
Kavukcuoglu K(undefined)undefined undefined undefined undefined-undefined