20 entries in total
- [1] Agrawal S., Goyal N. (2012) Analysis of Thompson sampling for the multi-armed bandit problem. 25th Annual Conference on Learning Theory, COLT’12, 23, 39.1-39.26
- [2] Auer P., Cesa-Bianchi N., Fischer P. (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 235-256
- [3] Bubeck S., Cesa-Bianchi N. (2012) Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning 5, 1-122
- [4] Garg D., Narahari Y., Gujar S. (2008) Foundations of mechanism design: A tutorial. Part 2: Advanced concepts and results. Sadhana - Indian Academy Proceedings in Engineering Sciences 33, 121-174
- [5] Lai T. L., Robbins H. (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4-22
- [6] Myerson R. B. (1981) Optimal auction design. Mathematics of Operations Research 6, 58-73
- [7] Nazerzadeh H., Saberi A., Vohra R. (2013) Dynamic pay-per-action mechanisms and applications to online advertising. Operations Research 61, 98-111
- [8] Robbins H. (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. 58, 527-535
- [9] Sharma A. D., Gujar S., Narahari Y. (2012) Truthful multi-armed bandit mechanisms for multi-slot sponsored search auctions. Current Science 103, 1064-1077
- [10] Thompson W. R. (1933) On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285-294