18 entries in total
[1]
Amin K., Kale S., Tesauro G., Turaga D., Budgeted prediction with expert advice, Proceedings of AAAI, (2015)
[2]
Audibert J., Bubeck S., Regret bounds and minimax policies under partial monitoring, The Journal of Machine Learning Research, 11, pp. 2785-2836, (2010)
[3]
Auer P., Cesa-Bianchi N., Freund Y., Schapire R.E., The nonstochastic multiarmed bandit problem, SIAM J. Comput., 32, 1, pp. 48-77, (2002)
[4]
Caron S., Kveton B., Lelarge M., Bhagat S., Leveraging side observations in stochastic bandits, Proceedings of UAI, (2012)
[5]
Cesa-Bianchi N., Freund Y., Haussler D., Helmbold D.P., Schapire R.E., Warmuth M.K., How to use expert advice, Journal of the ACM (JACM), 44, 3, pp. 427-485, (1997)
[6]
Cesa-Bianchi N., Lugosi G., Prediction, Learning, and Games, (2006)
[7]
Combes R., Jiang C., Srikant R., Bandits with budgets: Regret lower bounds and optimal algorithms, Proceedings of ACM SIGMETRICS, (2015)
[8]
Combes R., Proutiere A., Yun D., Ok J., Yi Y., Optimal rate sampling in 802.11 systems, Proceedings of IEEE INFOCOM, (2014)
[9]
Freund Y., Schapire R.E., A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, 55, 1, pp. 119-139, (1997)
[10]
Garivier A., Cappé O., The KL-UCB algorithm for bounded stochastic bandits and beyond, Proceedings of COLT, (2011)