Efficient approximate linear programming for factored MDPs

被引：5

作者：

Chen, Feng ^{[1
,2
]}

Cheng, Qiang ^{[3
]}

Dong, Jianwu ^{[1
]}

Yu, Zhaofei ^{[1
]}

Wang, Guojun ^{[4
]}

Xu, Wenli ^{[1
]}

机构：

[1] Tsinghua Univ, Dept Automat, Ctr Brain Inspired Comp Res, Beijing 100084, Peoples R China

[2] Beijing Key Lab Secur Big Data Proc & Applicat, Beijing, Peoples R China

[3] Nanjing Res Inst Elect Technol, Nanjing, Jiangsu, Peoples R China

[4] Nanjing Univ, Sch Management & Engn, Nanjing 210008, Jiangsu, Peoples R China

来源：

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING | 2015年 / 63卷

基金：

中国国家自然科学基金;

关键词：

Factored MDPs; Approximate linear programming; Junction graph; Cluster constraints; ALGORITHM;

D O I：

10.1016/j.ijar.2015.06.002

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Factored Markov Decision Processes (MDPs) provide a compact representation for modeling sequential decision making problems with many variables. Approximate linear programming (LP) is a prominent method for solving factored MDPs. However, it cannot be applied to models with large treewidth due to the exponential number of constraints. This paper proposes a novel and efficient approximate method to represent the exponentially many constraints. We construct an augmented junction graph from the factored MDP, and represent the constraints using a set of cluster constraints and separator constraints, where the cluster constraints play the role of reducing the number of constraints, and the separator constraints enforce the consistency of neighboring clusters so as to improve the accuracy. In the case where the junction graph is tree-structured, our method provides an equivalent representation to the original constraints. In other cases, our method provides a good trade-off between computation and accuracy. Experimental results on different models show that our algorithm performs better than other approximate linear programming algorithms on computational cost or expected reward. (C) 2015 Elsevier Inc. All rights reserved.

引用

页码：101 / 121

页数：21

共 32 条

[1]

[Anonymous], 2002, P 8 ACM SIGKDD INT C

[2]

BAHAR RI, 1993, 1993 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN - DIGEST OF TECHNICAL PAPERS, P188, DOI 10.1109/ICCAD.1993.580054

[3]

Barber D., 2011, Bayesian Reasoning and Machine Learning

[4]

Bertele U., 1972, NONSERIAL DYNAMIC PR

[5] Stochastic dynamic programming with factored representations [J].

Boutilier, C ;

Dearden, R ;

Goldszmidt, M .

ARTIFICIAL INTELLIGENCE, 2000, 121 (1-2) :49-107

[6]

CHENG Q, 2013, ADV NEURAL INFORM PR, V26, P2976

[7] On constraint sampling in the linear programming approach to approximate dynamic programming [J].

de Farias, DP ;

Van Roy, B .

MATHEMATICS OF OPERATIONS RESEARCH, 2004, 29 (03) :462-478

[8]

Degris T., 2010, FACTORED MARKOV DECI

[9] Efficient solutions to factored MDPs with imprecise transition probabilities [J].

Delgado, Karina Valdivia ;

Sanner, Scott ;

de Barros, Leliane Nunes .

ARTIFICIAL INTELLIGENCE, 2011, 175 (9-10) :1498-1527

[10]

Forsell N., 2006, P EUR C ART INT

← 1 2 3 4 →