Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

被引：0

作者：

Abbasi-Yadkori, Yasin ^{[1
]}

Bartlett, Peter L. ^{[1
,2
]}

Chen, Xi ^{[3
]}

Malek, Alan ^{[2
]}

机构：

[1] Queensland Univ Technol, Brisbane, Qld 4001, Australia

[2] Univ Calif Berkeley, Berkeley, CA 94720 USA

[3] NYU, Stern Sch Business, New York, NY 10003 USA

来源：

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37 | 2015年 / 37卷

基金：

澳大利亚研究理事会;

关键词：

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optimization as a convex optimization and solve it approximately using a stochastic subgradient algorithm. This method scales in complexity with the family of policies but not the state space. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowd-sourcing application.

引用

页码：1053 / 1062

页数：10

共 50 条

[1] Linear Programming for Large-Scale Markov Decision Problems
Abbasi-Yadkori, Yasin
Bartlett, Peter L.
Malek, Alan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 496 - 504
[2] A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing
Tanczos, Ervin
Nowak, Robert
Mankoff, Bob
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
[3] A METHODOLOGY FOR COMPUTATION REDUCTION FOR SPECIALLY STRUCTURED LARGE-SCALE MARKOV DECISION-PROBLEMS
DING, FY
HODGSON, TJ
KING, RE
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1988, 34 (01) : 105 - 112
[4] An Aggregation Procedure for Large-Scale Markov Decision Processes
Bartl, Ondrej
PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON MATHEMATICAL METHODS IN ECONOMICS 2004, 2004, : 9 - 15
[5] An application of simulation for large-scale Markov decision processes to a problem in telephone network routing
Zobel, C
Scherer, WT
1998 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5, 1998, : 2944 - 2955
[6] Solving large-scale control problems
Benner, P
IEEE CONTROL SYSTEMS MAGAZINE, 2004, 24 (01): : 44 - 59
[7] Cost Bounds for Pickup and Delivery Problems with Application to Large-Scale Transportation Systems
Treleaven, Kyle
Pavone, Marco
Frazzoli, Emilio
2012 AMERICAN CONTROL CONFERENCE (ACC), 2012, : 2120 - 2127
[8] On a class of large-scale cost-coupled Markov games with applications to decentralized power control
Huang, MY
Malhamé, RP
Caines, PE
2004 43RD IEEE CONFERENCE ON DECISION AND CONTROL (CDC), VOLS 1-5, 2004, : 2830 - 2835
[9] Application of Metaheuristics to Large-Scale Transportation Problems
D'Acierno, Luca
Gallo, Mariano
Montella, Bruno
LARGE-SCALE SCIENTIFIC COMPUTING, LSSC 2013, 2014, 8353 : 215 - 222
[10] Localized boundary knot method and its application to large-scale acoustic problems
Wang, Fajie
Gu, Yan
Qu, Wenzhen
Zhang, Chuanzeng
COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2020, 361

← 1 2 3 4 5 →