An extended ε-constraint method for a multiobjective finite-horizon Markov decision process

Cited by: 2
Authors
Eghbali-Zarch, Maryam [1]
Tavakkoli-Moghaddam, Reza [1]
Azaron, Amir [2, 3]
Dehghan-Sanej, Kazem [4]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Ind Engn, Tehran 1349957131, Iran
[2] Kwantlen Polytech Univ, Sch Business, Vancouver, BC V3W 2M8, Canada
[3] Univ British Columbia, Sauder Sch Business, Vancouver, BC V6T 1Z4, Canada
[4] Islamic Azad Univ, Sci & Res Branch, Dept Ind Engn, Tehran 1477893855, Iran
Keywords
K-best policies algorithm; Markov decision process; weighted-sum method; ε-constraint method; objective optimization; supply chain
DOI
10.1111/itor.12989
Chinese Library Classification number
C93 [Management]
Subject classification codes
12; 1201; 1202; 120202
Abstract
A Markov decision process (MDP) is an appropriate mathematical framework for analyzing and modeling a large class of sequential decision-making problems. Real-world applications require evaluating a decision against several conflicting objectives. This paper presents an extended ε-constraint method for a multiobjective finite-horizon MDP. The method integrates the ε-constraint method with the K-best policies algorithm to find the nondominated deterministic Markovian policies on the Pareto-optimal frontier. The proposed algorithm is evaluated on biobjective maintenance scheduling and machine running speed selection problems, and its performance is compared with a classic approach from the literature, the weighted-sum (WS) method. The results show that the proposed algorithm obtains a good-quality Pareto frontier and has advantages over the WS method.
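The contrast drawn in the abstract between an ε-constraint search and a weighted-sum search can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a hypothetical two-state, two-action, two-epoch MDP with made-up rewards, replaces the K-best policies algorithm with brute-force enumeration of all deterministic Markovian policies, and uses a simple lexicographic tie-break to avoid weakly dominated points. All names (`evaluate`, `epsilon_constraint`, `weighted_sum`) are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy instance: every number below is an illustrative assumption.
T = 2                      # number of decision epochs
STATES = (0, 1)
ACTIONS = (0, 1)
# P[s][a][s2]: probability of moving from s to s2 under action a
P = {0: {0: (F(1), F(0)), 1: (F(1, 2), F(1, 2))},
     1: {0: (F(0), F(1)), 1: (F(1, 2), F(1, 2))}}
# R[s][a]: (reward for objective 1, reward for objective 2), both maximized
R = {0: {0: (4, 0), 1: (1, 2)},
     1: {0: (0, 3), 1: (2, 2)}}


def evaluate(policy, start=0):
    """Backward induction: expected total reward vector of one policy."""
    value = {s: (F(0), F(0)) for s in STATES}      # zero terminal reward
    for t in reversed(range(T)):
        new_value = {}
        for s in STATES:
            a = policy[t][s]                        # action chosen at (t, s)
            new_value[s] = tuple(
                R[s][a][k] + sum(P[s][a][s2] * value[s2][k] for s2 in STATES)
                for k in range(2))
        value = new_value
    return value[start]


def nondominated(points):
    """Reference filter: keep points that no other point dominates."""
    return {p for p in points
            if not any(q != p and q[0] >= p[0] and q[1] >= p[1]
                       for q in points)}


def epsilon_constraint(points):
    """Maximize objective 1 subject to objective 2 >= eps, for each eps."""
    front = set()
    for eps in {p[1] for p in points}:              # one eps per attained level
        feasible = [p for p in points if p[1] >= eps]
        # Tuple order is lexicographic: objective 1 first, then objective 2.
        front.add(max(feasible))
    return front


def weighted_sum(points, steps=10):
    """Maximize w * obj1 + (1 - w) * obj2 over an evenly spaced weight grid."""
    front = set()
    for i in range(steps + 1):
        w = F(i, steps)
        front.add(max(points, key=lambda p: (w * p[0] + (1 - w) * p[1], p)))
    return front


rules = list(product(ACTIONS, repeat=len(STATES)))  # one action per state
policies = list(product(rules, repeat=T))           # one decision rule per epoch
points = {evaluate(pi) for pi in policies}
eps_front = epsilon_constraint(points)
ws_front = weighted_sum(points)
print(sorted((float(a), float(b)) for a, b in eps_front))
print(sorted((float(a), float(b)) for a, b in ws_front))
```

On this toy instance the ε-constraint sweep returns all six nondominated value vectors, while the weighted-sum sweep cannot return (5, 2), because that point lies below the line joining (8, 0) and (4, 3). Brute-force enumeration is only workable here because there are 16 policies; it is a stand-in chosen for brevity, not a substitute for a K-best policies search.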
Pages: 3131-3160
Number of pages: 30