Markov Decision Process in the Problem of Dynamic Pricing Policy

Cited by: 5
Authors
Chizhov, Yu. A. [1 ]
Borisov, A. N. [1 ]
Affiliations
[1] Riga Tech Univ, 1 Kalku Str, LV-1658 Riga, Latvia
Keywords
Markov decision process; dynamic pricing policy; MDP model construction;
DOI
10.3103/S0146411611060058
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Markov decision processes (MDPs) are widely used for problems whose solutions can be represented as a sequence of actions. Many papers demonstrate successful MDP applications in model problems, robotic control, planning, and similar domains. Economic problems likewise share the property of multistep progress toward a goal. This paper is dedicated to applying MDPs to the problem of pricing-policy management. The dynamic pricing problem is stated in terms of an MDP. Particular attention is paid to a method for constructing the MDP model based on data mining. Using sales data from an actual industrial plant, the construction of an MDP model, including the search for and generalization of regularities, is demonstrated.
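To make the MDP formulation of dynamic pricing concrete, the following is a minimal illustrative sketch, not the model from the paper: market demand levels serve as states, candidate prices as actions, and the optimal pricing policy is found by standard value iteration. All transition probabilities and sales figures below are invented for illustration.

```python
# Toy dynamic-pricing MDP solved by value iteration.
# States model market demand (0 = low, 1 = high); actions are price choices.
# The numbers are illustrative assumptions, not data from the paper.

GAMMA = 0.9            # discount factor
PRICES = [5.0, 10.0]   # action set: charge a low (0) or a high (1) price

# P[s][a] = distribution over next states after taking action a in state s.
P = {
    0: {0: [0.4, 0.6], 1: [0.8, 0.2]},  # low demand: a low price helps demand recover
    1: {0: [0.2, 0.8], 1: [0.5, 0.5]},  # high demand: a high price may suppress it
}
# SALES[s][a] = expected units sold in state s at price PRICES[a].
SALES = {0: {0: 3.0, 1: 1.0}, 1: {0: 5.0, 1: 3.0}}

def value_iteration(tol=1e-8):
    """Return the optimal state values and the greedy pricing policy."""
    V = [0.0, 0.0]
    while True:
        # Q[s][a] = immediate revenue + discounted expected future value.
        Q = [[SALES[s][a] * PRICES[a]
              + GAMMA * sum(p * V[s2] for s2, p in enumerate(P[s][a]))
              for a in range(len(PRICES))]
             for s in range(len(V))]
        V_new = [max(qs) for qs in Q]
        if max(abs(v - w) for v, w in zip(V, V_new)) < tol:
            policy = [max(range(len(PRICES)), key=lambda a: qs[a]) for qs in Q]
            return V_new, policy
        V = V_new

V, policy = value_iteration()
```

In the paper's setting, the transition and reward tables above would instead be estimated from historical sales data, which is where the data-mining step of the model construction comes in.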
Pages: 361-371
Page count: 11