Least squares approximate policy iteration for learning bid prices in choice-based revenue management

被引：11

作者：

Koch, Sebastian ^{[1
]}

机构：

[1] Univ Augsburg, Chair Analyt & Optimizat, Univ Str 16, D-86159 Augsburg, Germany

来源：

COMPUTERS & OPERATIONS RESEARCH | 2017年 / 77卷

关键词：

Revenue management; Capacity control; Approximate dynamic programming; Approximate policy iteration; DYNAMIC-PROGRAMMING ALGORITHM; VIRTUAL NESTING CONTROLS; CAPACITY CONTROL; SIMULATION; MODEL;

D O I：

10.1016/j.cor.2016.07.015

中图分类号：

TP39 [计算机的应用];

学科分类号：

081203 ; 0835 ;

摘要：

We consider the revenue management problem of capacity control under customer choice behavior. An exact solution of the underlying stochastic dynamic program is difficult because of the multi-dimensional state space and, thus, approximate dynamic programming (ADP) techniques are widely used. The key idea of ADP is to encode the multi-dimensional state space by a small number of basis functions, often leading to a parametric approximation of the dynamic program's value function. In general, two classes of ADP techniques for learning value function approximations exist: mathematical programming and simulation. So far, the literature on capacity control largely focuses on the first class. In this paper, we develop a least squares approximate policy iteration (API) approach which belongs to the second class. Thereby, we suggest value function approximations that are linear in the parameters, and we estimate the parameters via linear least squares regression. Exploiting both exact and heuristic knowledge from the value function, we enforce structural constraints on the parameters to facilitate learning a good policy. We perform an extensive simulation study to investigate the performance of our approach. The results show that it is able to obtain competitive revenues compared to and often out-performs state-of-the-art capacity control methods in reasonable computational time. Depending on the scarcity of capacity and the point in time, revenue improvements of around 1% or more can be observed. Furthermore, the proposed approach contributes to simulation-based ADP, bringing forth research on numerically estimating piecewise linear value function approximations and their application in revenue management environments. (C) 2016 Elsevier Ltd. All rights reserved.

引用

页码：240 / 253

页数：14

共 48 条

[1] Dynamic bid prices in revenue management [J].

Adelman, Daniel .

OPERATIONS RESEARCH, 2007, 55 (04) :647-661

[2] A price-directed heuristic for the economic lot scheduling problem [J].

Adelman, Daniel ;

Barz, Christiane .

IIE TRANSACTIONS, 2014, 46 (12) :1343-1356

[3] Bid-Price Controls for Network Revenue Management: Martingale Characterization of Optimal Bid Prices [J].

Akan, Mustafa ;

Ata, Baris .

MATHEMATICS OF OPERATIONS RESEARCH, 2009, 34 (04) :912-936

[4] Single-leg air-cargo revenue management [J].

Amaruchkul, Kannapha ;

Cooper, William L. ;

Gupta, Diwakar .

TRANSPORTATION SCIENCE, 2007, 41 (04) :457-469

[5]

[Anonymous], 2012, DYNAMIC PROGRAMMING

[6]

[Anonymous], 2003, J. Mach. Learn. Res.

[7]

[Anonymous], 2011, Approximate dynamic programming: Solving the curses of dimensionality

[8] Hierarchical Multi-skill Resource Assignment in the Telecommunications Industry [J].

Barz, Christiane ;

Kolisch, Rainer .

PRODUCTION AND OPERATIONS MANAGEMENT, 2014, 23 (03) :489-503

[9]

Bertsimas D., 1997, INTRO LINEAR OPTIMIZ, V6

[10] Computing Bid Prices for Revenue Management Under Customer Choice Behavior [J].

Chaneton, Juan M. ;

Vulcano, Gustavo .

M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT, 2011, 13 (04) :452-470

← 1 2 3 4 5 →