Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Cited by: 0
Authors
Abbasi-Yadkori, Yasin [1]
Bartlett, Peter L. [1, 2]
Chen, Xi [3]
Malek, Alan [2]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld 4001, Australia
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] NYU, Stern Sch Business, New York, NY 10003 USA
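Funding
Australian Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract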
We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optimization as a convex optimization problem and solve it approximately using a stochastic subgradient algorithm. This method scales in complexity with the family of policies but not the state space. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowdsourcing application.
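The abstract does not spell out the cost function. As a minimal sketch, the block below assumes the form common in KL-control formulations: a per-state cost plus the KL divergence between the controlled next-state distribution and a fixed "passive" one. The names `kl_divergence`, `kl_control_cost`, `passive`, and `controlled` are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                # p puts mass where q has none: the divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def kl_control_cost(state_cost, controlled, passive):
    """Assumed cost form: state cost plus KL(controlled || passive)."""
    return state_cost + kl_divergence(controlled, passive)


# Hypothetical two-state example: passive dynamics are uniform.
passive = [0.5, 0.5]
controlled = [0.9, 0.1]

cost_same = kl_control_cost(1.0, passive, passive)        # no control effort
cost_shifted = kl_control_cost(1.0, controlled, passive)  # pays a KL penalty
```

Under this assumed form, a policy that leaves the passive dynamics unchanged pays only the state cost, and any deviation adds a nonnegative penalty.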
Pages: 1053-1062
Page count: 10
Related papers
50 in total
  • [21] Perimeter Traffic Flow Control for a Multi-Region Large-Scale Traffic Network With Markov Decision Process
    Xu, Yunwen
    Li, Dewei
    Xi, Yugeng
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (06) : 4809 - 4821
  • [22] INTERACTIVE MULTIOBJECTIVE DECISION-MAKING FOR LARGE-SCALE SYSTEMS AND ITS APPLICATION TO ENVIRONMENTAL SYSTEMS
    SAKAWA, M
    SEO, F
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1980, 10 (12) : 796 - 806
  • [23] Redefined decision variable analysis method for large-scale optimization and its application to feature selection
    Li, Yongfeng
    Li, Lingjie
    Tang, Huimei
    Lin, Qiuzhen
    Ming, Zhong
    Leung, Victor C. M.
    SWARM AND EVOLUTIONARY COMPUTATION, 2023, 82
  • [24] Crowdsourcing based large-scale network anomaly detection
    Li, Yang
    Huang, Wenguang
    Tian, Xiaohua
    2018 10TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP), 2018
  • [25] SLADE: A Smart Large-Scale Task Decomposer in Crowdsourcing
    Tong, Yongxin
    Chen, Lei
    Zhou, Zimu
    Jagadish, H. V.
    Shou, Lidan
    Lv, Weifeng
    2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019 : 2133 - 2134
  • [26] Crowdsourcing for large-scale mosquito (Diptera: Culicidae) sampling
    Maki, Elin C.
    Cohnstaedt, Lee W.
    CANADIAN ENTOMOLOGIST, 2015, 147 (01) : 118 - 123
  • [27] CrowdLink: Crowdsourcing for Large-Scale Linked Data Management
    Basharat, Amna
    Arpinar, I. Budak
    Dastgheib, Shima
    Kursuncu, Ugur
    Kochut, Krys
    Dogdu, Erdogan
    2014 IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2014 : 227 - 234
  • [28] Study on decision algorithm of monitoring and control in large-scale systems
    Li, Haolin
    Shanghai Ligong Daxue Xuebao/Journal of University of Shanghai for Science and Technology, 2000, 22 (01) : 35 - 37
  • [29] SLADE: A Smart Large-Scale Task Decomposer in Crowdsourcing
    Tong, Yongxin
    Chen, Lei
    Zhou, Zimu
    Jagadish, H. V.
    Shou, Lidan
    Lv, Weifeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (08) : 1588 - 1601
  • [30] Inhibiting Disturbance Propagation in Large-scale Systems and Its Application to Power System Control
    Inoue, Masaki
    Suzumura, Mizuki
    Urata, Kengo
    2018 EUROPEAN CONTROL CONFERENCE (ECC), 2018 : 423 - 428