An extended ε-constraint method for a multiobjective finite-horizon Markov decision process

Cited by: 2

Authors:

Eghbali-Zarch, Maryam [1]

Tavakkoli-Moghaddam, Reza [1]

Azaron, Amir [2, 3]

Dehghan-Sanej, Kazem [4]

Affiliations:

[1] Univ Tehran, Coll Engn, Sch Ind Engn, Tehran 1349957131, Iran

[2] Kwantlen Polytech Univ, Sch Business, Vancouver, BC V3W 2M8, Canada

[3] Univ British Columbia, Sauder Sch Business, Vancouver, BC V6T 1Z4, Canada

[4] Islamic Azad Univ, Sci & Res Branch, Dept Ind Engn, Tehran 1477893855, Iran

Source:

INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH | 2022 / Vol. 29 / No. 05

Keywords:

K-best policies algorithm; Markov decision process; weighted-sum method; ε-constraint method; OBJECTIVE OPTIMIZATION; SUPPLY CHAIN

DOI:

10.1111/itor.12989

Chinese Library Classification (CLC) number:

C93 [Management Science]

Discipline classification codes:

12; 1201; 1202; 120202

Abstract:

A Markov decision process (MDP) is an appropriate mathematical framework for analyzing and modeling a large class of sequential decision-making problems. Real-world applications often require the value of a decision to be evaluated against several conflicting objectives. This paper presents an extended ε-constraint method for a multiobjective finite-horizon MDP. The method integrates the ε-constraint method with the K-best policies algorithm to find the nondominated deterministic Markovian policies on the Pareto-optimal frontier. The proposed algorithm is evaluated on biobjective maintenance scheduling and machine running speed selection problems, and its performance is compared with a classic approach from the literature, the weighted-sum (WS) method. The results show that the proposed algorithm obtains a good-quality Pareto frontier and has advantages over the WS method.
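To make the idea in the abstract concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a hypothetical two-state, two-action, two-stage MDP (all numbers in `P`, `R1`, `R2` are invented), and it swaps the K-best policies algorithm for brute-force enumeration of deterministic Markovian policies, which is only feasible for tiny instances. The ε-constraint loop maximizes the first objective subject to the second objective being at least ε, then raises ε just past the value found and repeats.

```python
from itertools import product

# Hypothetical toy instance (assumption, not taken from the paper).
T, S, A = 2, 2, 2                       # horizon, number of states, number of actions
P = [[[0.9, 0.1], [0.6, 0.4]],          # P[s][a][s2]: transition probabilities
     [[0.0, 1.0], [1.0, 0.0]]]
R1 = [[5.0, 3.0], [0.0, -2.0]]          # R1[s][a]: stage reward, objective 1
R2 = [[0.0, 2.0], [1.0, 4.0]]           # R2[s][a]: stage reward, objective 2


def evaluate(policy, R, s0=0):
    """Expected total reward from s0 under policy[t][s], by backward induction."""
    v = [0.0] * S
    for t in reversed(range(T)):
        v = [R[s][policy[t][s]]
             + sum(P[s][policy[t][s]][s2] * v[s2] for s2 in range(S))
             for s in range(S)]
    return round(v[s0], 9)              # rounding guards against float noise in ties


def all_policies():
    """Enumerate every deterministic Markovian policy (stand-in for K-best search)."""
    for flat in product(range(A), repeat=T * S):
        yield tuple(tuple(flat[t * S:(t + 1) * S]) for t in range(T))


def epsilon_constraint_frontier(step=1e-6):
    """Sweep epsilon upward: max f1 subject to f2 >= eps, ties broken by larger f2."""
    scored = [(evaluate(p, R1), evaluate(p, R2), p) for p in all_policies()]
    frontier, eps = [], float("-inf")
    while True:
        feasible = [x for x in scored if x[1] >= eps]
        if not feasible:
            break
        best = max(feasible, key=lambda x: (x[0], x[1]))
        frontier.append(best)
        eps = best[1] + step            # tighten the constraint past the point found
    return frontier


frontier = epsilon_constraint_frontier()
for f1, f2, policy in frontier:
    print(f1, f2, policy)
```

Breaking ties by the larger second objective keeps weakly dominated policies off the frontier. Unlike a weighted-sum sweep, this constraint-based sweep can also return nondominated points that are not on the convex hull of the objective space.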


Pages: 3131-3160

Page count: 30

