Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Cited by: 0
Authors
Abbasi-Yadkori, Yasin [1]
Bartlett, Peter L. [1, 2]
Chen, Xi [3]
Malek, Alan [2]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld 4001, Australia
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] NYU, Stern Sch Business, New York, NY 10003 USA
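Funding
Australian Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract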
We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optimization as a convex optimization and solve it approximately using a stochastic subgradient algorithm. This method scales in complexity with the family of policies but not the state space. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowd-sourcing application.
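The abstract does not spell out the cost function. As a minimal sketch, assume the form common in the KL-control literature: a state cost q(x) plus the Kullback-Leibler divergence between the controlled transition distribution P(.|x) and a fixed passive (uncontrolled) distribution P0(.|x). The function names and the two-state numbers below are hypothetical and only illustrate how the average cost of a fixed policy could be evaluated; this is not the paper's stochastic subgradient algorithm.

```python
import math

def kl_divergence(p, p0):
    """KL(p || p0) for two discrete distributions given as lists."""
    total = 0.0
    for pi, p0i in zip(p, p0):
        if pi > 0.0:  # terms with pi == 0 contribute nothing
            total += pi * math.log(pi / p0i)
    return total

def kl_control_cost(q, P, P0):
    """Assumed per-state cost: q[x] + KL(P[x] || P0[x])."""
    return [q[x] + kl_divergence(P[x], P0[x]) for x in range(len(q))]

def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution by repeatedly applying P.
    Assumes the chain is ergodic so the iteration converges."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]
    return mu

def average_cost(q, P, P0):
    """Long-run average of the assumed KL control cost under policy P."""
    mu = stationary_distribution(P)
    costs = kl_control_cost(q, P, P0)
    return sum(m * c for m, c in zip(mu, costs))

# Toy two-state problem (all numbers hypothetical).
q = [1.0, 0.0]                   # state 0 is costly, state 1 is free
P0 = [[0.5, 0.5], [0.5, 0.5]]    # passive dynamics
P = [[0.1, 0.9], [0.1, 0.9]]     # policy that steers toward state 1

passive_cost = average_cost(q, P0, P0)
steered_cost = average_cost(q, P, P0)
```

In this toy case the steering policy pays a KL penalty for deviating from the passive dynamics but spends less time in the costly state, so its average cost comes out lower than that of the passive dynamics.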
Pages: 1053-1062
Number of pages: 10