Bias and variance approximation in value function estimates

Cited by: 96
Authors
Mannor, Shie [1]
Simester, Duncan
Sun, Peng
Tsitsiklis, John N.
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 2A7, Canada
[2] MIT, Sloan Sch Management, Cambridge, MA 02139 USA
[3] Duke Univ, Fuqua Sch Business, Durham, NC 27708 USA
[4] MIT, Informat & Decis Syst Lab, Cambridge, MA 02139 USA
Keywords
value function; confidence interval; variance; bias; dynamic programming models; Markov decision processes; management
DOI
10.1287/mnsc.1060.0614
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance, which can then be used to derive confidence intervals around the value function estimates. We illustrate and validate our findings using a large database describing the transaction and mailing histories for customers of a mail-order catalog firm.
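The effect described in the abstract can be illustrated numerically. The sketch below is not the paper's closed-form approximation; it is a brute-force Monte Carlo check under simplifying assumptions: a fixed policy (so the process reduces to a Markov reward process), known rewards, and a hypothetical 3-state transition matrix `P`, reward vector `r`, and discount factor `gamma` chosen only for illustration. It repeatedly estimates the transition matrix from a finite sample, recomputes the value function, and reports the empirical bias and variance of the resulting estimates.

```python
import numpy as np

def value_function(P, r, gamma):
    """Solve V = r + gamma * P V for a fixed policy."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)

def simulate_bias_variance(P, r, gamma, n_samples, n_reps, seed=0):
    """Monte Carlo estimate of the bias and variance of plug-in value estimates."""
    rng = np.random.default_rng(seed)
    v_true = value_function(P, r, gamma)
    estimates = np.empty((n_reps, P.shape[0]))
    for k in range(n_reps):
        # Observe n_samples transitions out of each state and
        # estimate each row of P by its empirical frequencies.
        counts = np.vstack([rng.multinomial(n_samples, row) for row in P])
        P_hat = counts / n_samples
        estimates[k] = value_function(P_hat, r, gamma)
    bias = estimates.mean(axis=0) - v_true
    variance = estimates.var(axis=0, ddof=1)
    return v_true, bias, variance

# Hypothetical model, assumed for illustration only.
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
r = np.array([1.0, 0.0, 2.0])

v_true, bias, variance = simulate_bias_variance(
    P, r, gamma=0.9, n_samples=50, n_reps=2000
)
print("true values:", v_true)
print("empirical bias:", bias)
print("empirical variance:", variance)
```

Because the value function is a nonlinear function of the transition matrix, the plug-in estimate is generally biased even when the estimate of `P` itself is unbiased; increasing `n_samples` should shrink both the bias and the variance.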
Pages: 308-322
Page count: 15