Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Cited by: 0
Authors
Abbasi-Yadkori, Yasin [1 ]
Bartlett, Peter L. [1 ,2 ]
Chen, Xi [3 ]
Malek, Alan [2 ]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld 4001, Australia
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] NYU, Stern Sch Business, New York, NY 10003 USA
Funding
Australian Research Council;
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optimization as a convex optimization problem and solve it approximately using a stochastic subgradient algorithm. This method scales in complexity with the size of the policy family but not with the state space. We show that the performance of the resulting policy is close to that of the best policy in the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowdsourcing application.
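The abstract's central computational claim is that, once the search is restricted to a low-dimensional parametric policy family, the optimization can be run with a stochastic subgradient method whose per-iteration cost depends only on the parameter dimension, not on the number of states. The snippet below is a minimal, hypothetical sketch of that generic scheme (projected stochastic subgradient descent over a parameter theta in R^d); the feature map, per-state loss, and sampling distribution are invented placeholders for illustration and are not the paper's actual KL-control convex program.

```python
# Illustrative sketch only: generic projected stochastic subgradient descent
# over a low-dimensional policy parameter theta. The objective, features, and
# sampling scheme are invented placeholders, not the paper's KL-control objective.
import numpy as np

rng = np.random.default_rng(0)

d = 5             # dimension of the policy parameterization (small, independent of |S|)
n_states = 10**6  # large state space: states are only sampled, never enumerated
radius = 10.0     # radius of the Euclidean ball used for projection

def features(state):
    """Hypothetical low-dimensional feature map phi(s) in R^d."""
    rs = np.random.default_rng(state)      # deterministic pseudo-features per state
    return rs.standard_normal(d)

def subgradient(theta, state):
    """Subgradient of a convex per-state loss; here a hinge-style surrogate
    max(0, 1 - phi(s)^T theta), chosen purely for illustration."""
    phi = features(state)
    return -phi if phi @ theta < 1.0 else np.zeros(d)

def project(theta):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(theta)
    return theta if norm <= radius else theta * (radius / norm)

theta = np.zeros(d)
theta_avg = np.zeros(d)
T = 5000
for t in range(1, T + 1):
    s = rng.integers(n_states)                        # sample a state; O(d) work per step
    g = subgradient(theta, s)
    theta = project(theta - (1.0 / np.sqrt(t)) * g)   # step size proportional to 1/sqrt(t)
    theta_avg += (theta - theta_avg) / t              # running average of the iterates

print("averaged parameter:", theta_avg)
```

The point of the sketch is the cost profile: each iteration touches one sampled state and a d-dimensional vector, so the runtime is governed by the policy family's dimension rather than by the size of the state space.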
Pages: 1053-1062
Page count: 10