62 entries in total
[1]
Abdolmaleki A., 2018, Maximum a posteriori policy optimisation
[2]
Andrychowicz M, 2016, ADV NEUR IN, V29
[3]
[Anonymous], 2009, P 26 ANN INT C MACH
[4]
[Anonymous], 2002, Advances in neural information processing systems
[5]
Anschel O, 2017, 34 INT C MACHINE LEA, V70
[7]
Babaeizadeh M., 2017, Reinforcement learning through asynchronous advantage actor-critic on a gpu
[9]
Active Object Localization with Deep Reinforcement Learning [J]. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 2488-2496
[10]
Degris T, 2012, P AMER CONTR CONF, P2177