机构:
Microsoft Res, 9 Lavelle Rd, Bengaluru 560001, Karnataka, IndiaMicrosoft Res, 9 Lavelle Rd, Bengaluru 560001, Karnataka, India
Agrawal, Shipra
[1
]
Goyal, Navin
论文数: 0引用数: 0
h-index: 0
机构:
Microsoft Res, 9 Lavelle Rd, Bengaluru 560001, Karnataka, India
Columbia Univ, Dept Ind Engn & Operat Res, 500 West 120th St,Mudd 423, New York, NY 10027 USAMicrosoft Res, 9 Lavelle Rd, Bengaluru 560001, Karnataka, India
Goyal, Navin
[1
,2
]
机构:
[1] Microsoft Res, 9 Lavelle Rd, Bengaluru 560001, Karnataka, India
[2] Columbia Univ, Dept Ind Engn & Operat Res, 500 West 120th St,Mudd 423, New York, NY 10027 USA
Thompson Sampling (TS) is one of the oldest heuristics for multiarmed bandit problems. It is a randomized algorithm based on Bayesian ideas and has recently generated significant interest after several studies demonstrated that it has favorable empirical performance compared to the state-of-the-art methods. In this article, a novel and almost tight martingale-based regret analysis for Thompson Sampling is presented. Our technique simultaneously yields both problem-dependent and problem-independent bounds: (1) the first near-optimal problem-independent bound of O(root NT ln T) on the expected regret and (2) the optimal problem-dependent bound of (1 + epsilon) Sigma i ln T/d(mu(i), mu(1)) + O(N/epsilon(2)) on the expected regret (this bound was first proven by Kaufmann et al. (2012b)). Our technique is conceptually simple and easily extends to distributions other than the Beta distribution used in the original TS algorithm. For the version of TS that uses Gaussian priors, we prove a problem-independent bound of O(root NT ln N) on the expected regret and show the optimality of this bound by providing a matching lower bound. This is the first lower bound on the performance of a natural version of Thompson Sampling that is away from the general lower bound of Omega(root NT) for the multiarmed bandit problem.
机构:
Cent Supelec, F-91190 Gif Sur Yvette, FranceCent Supelec, F-91190 Gif Sur Yvette, France
Combes, Richard
Ok, Jungseul
论文数: 0引用数: 0
h-index: 0
机构:
KTH Royal Inst Technol, S-1428 Stockholm, Sweden
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France
Ok, Jungseul
Proutiere, Alexandre
论文数: 0引用数: 0
h-index: 0
机构:
KTH Royal Inst Technol, S-1428 Stockholm, Sweden
INRIA Microsoft Joint Res Lab, F-78153 Le Chesnay, FranceCent Supelec, F-91190 Gif Sur Yvette, France
Proutiere, Alexandre
Yun, Donggyu
论文数: 0引用数: 0
h-index: 0
机构:
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France
Yun, Donggyu
Yi, Yung
论文数: 0引用数: 0
h-index: 0
机构:
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France
机构:
Cent Supelec, F-91190 Gif Sur Yvette, FranceCent Supelec, F-91190 Gif Sur Yvette, France
Combes, Richard
Ok, Jungseul
论文数: 0引用数: 0
h-index: 0
机构:
KTH Royal Inst Technol, S-1428 Stockholm, Sweden
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France
Ok, Jungseul
Proutiere, Alexandre
论文数: 0引用数: 0
h-index: 0
机构:
KTH Royal Inst Technol, S-1428 Stockholm, Sweden
INRIA Microsoft Joint Res Lab, F-78153 Le Chesnay, FranceCent Supelec, F-91190 Gif Sur Yvette, France
Proutiere, Alexandre
Yun, Donggyu
论文数: 0引用数: 0
h-index: 0
机构:
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France
Yun, Donggyu
Yi, Yung
论文数: 0引用数: 0
h-index: 0
机构:
Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South KoreaCent Supelec, F-91190 Gif Sur Yvette, France