16 entries in total
- [1] Finite-time analysis of the multiarmed bandit problem [J]. MACHINE LEARNING, 2002, 47 (2-3) : 235 - 256
- [5] Cazenave T, 2015, PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), P754
- [7] Coulom R, 2007, LECT NOTES COMPUT SC, V4630, P72
- [8] Gelly S., 2007, P 24 INT C MACH LEAR, P273, DOI 10.1145/1273496.1273531
- [9] Genesereth M., 2014, SYNTHESIS LECT ARTIF, V8, P1, DOI 10.2200/S00564ED1V01Y201311AIM024
- [10] Bandit based Monte-Carlo planning [J]. MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 282 - 293