A stochastic multi-armed bandit approach to nonparametric H∞-norm estimation

Authors
Mueller, Matias I. [1]
Valenzuela, Patricio E. [1]
Proutiere, Alexandre [1]
Rojas, Cristian R. [1]
Affiliations
[1] KTH Royal Inst Technol, Dept Automat Control, SE-10044 Stockholm, Sweden
Funding
Swedish Research Council
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study the problem of estimating the largest gain of an unknown linear time-invariant filter, also known as the H∞ norm of the system. Using ideas from the stochastic multi-armed bandit framework, we present a new algorithm that sequentially designs an input signal to estimate this quantity from input-output data. The algorithm is shown empirically to outperform Thompson Sampling, an asymptotically optimal method, in terms of cumulative regret. Finally, for a general class of algorithms, a lower bound on the performance of finding the H∞ norm is derived.
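The bandit view of gain estimation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, and it is not Thompson Sampling: it uses the standard UCB1 index rule as a stand-in. All names (`fir_gain`, `ucb_hinf_estimate`), the FIR test filter, the frequency grid, and the Gaussian noise model are assumptions made for illustration. Each arm is a candidate frequency, and pulling an arm returns a noisy measurement of the squared gain at that frequency.

```python
import math
import random


def fir_gain(coeffs, omega):
    """Magnitude of the frequency response of an FIR filter at frequency omega."""
    re = sum(c * math.cos(omega * k) for k, c in enumerate(coeffs))
    im = -sum(c * math.sin(omega * k) for k, c in enumerate(coeffs))
    return math.hypot(re, im)


def ucb_hinf_estimate(coeffs, n_arms=16, rounds=4000, noise_std=0.1, seed=0):
    """Estimate the peak gain over a frequency grid with a UCB1-style rule.

    Assumption: the 'unknown' filter is simulated from `coeffs`, and each
    experiment yields the squared gain at one frequency plus Gaussian noise.
    """
    rng = random.Random(seed)
    # Hypothetical grid of candidate frequencies on [0, pi]; one arm each.
    omegas = [math.pi * k / (n_arms - 1) for k in range(n_arms)]
    counts = [0] * n_arms
    means = [0.0] * n_arms

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once before using the index
        else:
            # Optimistic index: empirical mean plus an exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a]
                + noise_std * math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = fir_gain(coeffs, omegas[arm]) ** 2 + rng.gauss(0.0, noise_std)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average

    best = max(range(n_arms), key=lambda a: counts[a])  # most-played arm
    return omegas[best], math.sqrt(max(means[best], 0.0)), counts


if __name__ == "__main__":
    omega_hat, gain_hat, _ = ucb_hinf_estimate([1.0, 0.5])
    print(f"estimated peak frequency: {omega_hat:.3f} rad/sample")
    print(f"estimated H-infinity norm: {gain_hat:.3f}")
```

For the example filter with coefficients 1.0 and 0.5, the true peak gain is 1.5 at zero frequency, so the estimate should land close to that value. The grid restricts the search to a finite set of frequencies, which is a simplification of the nonparametric setting named in the title.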
Pages: 6
Related papers
50 in total
  • [1] Multi-Armed Bandit for Species Discovery: A Bayesian Nonparametric Approach
    Battiston, Marco
    Favaro, Stefano
    Teh, Yee Whye
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (521) : 455 - 466
  • [2] Adversarial multi-armed bandit approach to stochastic optimization
    Chang, Hyeong Soo
    Fu, Michael C.
    Marcus, Steven I.
    PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5684 - +
  • [3] Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
    Yang, YH
    Zhu, D
    ANNALS OF STATISTICS, 2002, 30 (01): : 100 - 121
  • [4] The Multi-Armed Bandit With Stochastic Plays
    Lesage-Landry, Antoine
    Taylor, Joshua A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (07) : 2280 - 2286
  • [5] THE MULTI-ARMED BANDIT PROBLEM: AN EFFICIENT NONPARAMETRIC SOLUTION
    Chan, Hock Peng
    ANNALS OF STATISTICS, 2020, 48 (01): : 346 - 373
  • [6] Achieving Fairness in the Stochastic Multi-Armed Bandit Problem
    Patil, Vishakha
    Ghalme, Ganesh
    Nair, Vineet
    Narahari, Y.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
  • [7] Mechanisms with learning for stochastic multi-armed bandit problems
    Shweta Jain
    Satyanath Bhat
    Ganesh Ghalme
    Divya Padmanabhan
    Y. Narahari
    Indian Journal of Pure and Applied Mathematics, 2016, 47 : 229 - 272
  • [8] MECHANISMS WITH LEARNING FOR STOCHASTIC MULTI-ARMED BANDIT PROBLEMS
    Jain, Shweta
    Bhat, Satyanath
    Ghalme, Ganesh
    Padmanabhan, Divya
    Narahari, Y.
    INDIAN JOURNAL OF PURE & APPLIED MATHEMATICS, 2016, 47 (02): : 229 - 272
  • [9] Achieving Fairness in the Stochastic Multi-Armed Bandit Problem
    Patil, Vishakha
    Ghalme, Ganesh
    Nair, Vineet
    Narahari, Y.
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 5379 - 5386
  • [10] Achieving fairness in the stochastic multi-armed bandit problem
    Patil, Vishakha
    Ghalme, Ganesh
    Nair, Vineet
    Narahari, Y.
    1600, Microtome Publishing (22): : 1 - 31