34 entries in total
[7] Hinton Geoff., 1994, ADV NEURAL INFORM PR, V6
[9] Bandit based Monte-Carlo planning [J]. MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212: 282-293