X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks

Cited by: 0

Authors:

Torossian, Leonard [1,2]

Garivier, Aurelien [3]

Picheny, Victor [4]

Affiliations:

[1] Univ Toulouse, INRA, Toulouse, France

[2] Inst Math Toulouse, Toulouse, France

[3] Univ Lyon, ENS Lyon, Lyon, France

[4] PROWLER.io, 72 Hills Rd, Cambridge, England

Source:

ASIAN CONFERENCE ON MACHINE LEARNING, VOL 101 | 2019 / Vol. 101

Keywords:

Optimistic optimization; Risk-averse solutions; Quantile optimization; CVaR optimization

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We propose and analyze StoROO, an algorithm derived from StoOO for risk optimization on stochastic black-box functions. Motivated by risk-averse decision making in fields such as agriculture, medicine, biology, and finance, we focus not on the mean payoff but on generic functionals of the return distribution. We provide a generic regret analysis of StoROO and illustrate its applicability with two examples: the optimization of quantiles and CVaR. Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the possibility of constructing confidence intervals for the targeted functional based on random-size samples. We detail their construction in the case of quantiles, providing tight bounds based on the Kullback-Leibler divergence. Finally, we present numerical experiments showing that tight bounds have a dramatic impact on the optimization of quantiles and CVaR.
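
To make the idea of a KL-based confidence interval for a quantile concrete, here is a minimal Python sketch. It is not the construction from the paper, which this record does not reproduce. It is a textbook Chernoff-bound argument for a fixed sample size (the paper treats random-size samples) and assumes i.i.d. samples from a continuous distribution. The names `bernoulli_kl`, `kl_level_bounds`, and `quantile_confidence_interval` are hypothetical, chosen for illustration only.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-15
    q = min(max(q, eps), 1.0 - eps)  # keep the logarithms finite
    kl = 0.0
    if p > 0.0:
        kl += p * math.log(p / q)
    if p < 1.0:
        kl += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return kl


def kl_level_bounds(tau, n, delta, iters=60):
    """Levels p_lo <= tau <= p_hi with n * kl(p, tau) just above log(2 / delta).

    Bisection returns the outer endpoint of each bracket, so the
    resulting interval errs on the conservative (wider) side.
    """
    budget = math.log(2.0 / delta) / n

    # Upper level: search in [tau, 1].
    if bernoulli_kl(1.0, tau) <= budget:
        p_hi = 1.0
    else:
        lo, hi = tau, 1.0  # kl(lo) <= budget < kl(hi)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if bernoulli_kl(mid, tau) <= budget:
                lo = mid
            else:
                hi = mid
        p_hi = hi

    # Lower level: search in [0, tau].
    if bernoulli_kl(0.0, tau) <= budget:
        p_lo = 0.0
    else:
        lo, hi = 0.0, tau  # kl(lo) > budget >= kl(hi)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if bernoulli_kl(mid, tau) <= budget:
                hi = mid
            else:
                lo = mid
        p_lo = lo

    return p_lo, p_hi


def quantile_confidence_interval(samples, tau, delta):
    """Order-statistic interval intended to cover the tau-quantile
    with probability at least 1 - delta (assumption: continuous i.i.d. data)."""
    xs = sorted(samples)
    n = len(xs)
    p_lo, p_hi = kl_level_bounds(tau, n, delta)
    k_lo = math.ceil(n * p_lo)   # at least k_lo samples lie below the quantile
    k_hi = math.floor(n * p_hi)  # at most k_hi samples lie below the quantile
    lower = xs[k_lo - 1] if k_lo >= 1 else -math.inf
    upper = xs[k_hi] if k_hi < n else math.inf
    return lower, upper


if __name__ == "__main__":
    data = [float(i) for i in range(1, 101)]
    print(quantile_confidence_interval(data, tau=0.5, delta=0.05))
```

With very few samples the KL budget cannot exclude any level, and the sketch returns an unbounded interval rather than a misleadingly narrow one.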

Pages: 268-283

Page count: 16
