Reinforcement learning for decision-making under deep uncertainty

Cited by: 5
Authors
Pei, Zhihao [1 ,4 ]
Rojas-Arevalo, Angela M. [3 ]
de Haan, Fjalar J. [1 ,2 ]
Lipovetzky, Nir [1 ]
Moallemi, Enayat A. [3 ]
Affiliations
[1] Univ Melbourne, Fac Engn & Informat Technol, Sch Comp & Informat Syst, Melbourne, Australia
[2] Univ Melbourne, Melbourne Ctr Data Sci, Melbourne, Australia
[3] Commonwealth Sci & Ind Res Org CSIRO, Melbourne, Australia
[4] 700 Swanston St, Carlton, Vic 3053, Australia
Keywords
Deep uncertainty; Exploratory modeling; Reinforcement learning; Multi-objective evolutionary algorithm; Adaptation; Robustness; CLIMATE-CHANGE UNCERTAINTIES; ADAPTIVE POLICY PATHWAYS; INFO-GAP; ROBUST; MANAGEMENT; ADAPTATION; ALGORITHMS; SEARCH;
DOI
10.1016/j.jenvman.2024.120968
Chinese Library Classification
X [Environmental Science, Safety Science];
Discipline Classification Code
08; 0830;
Abstract
Planning under complex uncertainty often calls for plans that can adapt to changing future conditions. To inform plan development, exploration methods have been used to assess the performance of candidate policies under uncertainty. However, these methods rarely enable adaptation by themselves, so additional effort is required to develop the final adaptive plans, which compromises overall decision-making efficiency. This paper introduces Reinforcement Learning (RL), which employs closed-loop control, as a new exploration method that enables automated adaptive policy-making for planning under uncertainty. To investigate its performance, we compare RL with a widely used exploration method, the Multi-Objective Evolutionary Algorithm (MOEA), on two hypothetical problems via computational experiments. Our results indicate that the two methods are complementary. RL makes better use of its exploration history, consistently providing higher efficiency and better policy robustness in the presence of parameter uncertainty. MOEA quantifies objective uncertainty in a more intuitive way, providing better robustness to objective uncertainty. These findings will help researchers choose the appropriate method for different applications.
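As a minimal sketch of the closed-loop idea described above (not code from the paper), the listing below trains a tabular Q-learning agent whose policy conditions on the observed state at every step, while an uncertain environment parameter is resampled each episode. The toy pollution-control environment, its state discretisation, and the reward function are hypothetical stand-ins chosen only to illustrate how an RL policy becomes adaptive by construction.

import random

# Hypothetical toy environment: a pollution-control problem where the
# natural decay rate is deeply uncertain (resampled each episode).
N_STATES = 10          # discretised pollution levels 0..9 (9 = worst)
ACTIONS = [0, 1]       # 0 = no abatement, 1 = costly abatement

def step(state, action, decay):
    # Closed-loop transition: the next state depends on the observed state,
    # the chosen action, and the uncertain decay parameter.
    inflow = random.choice([0, 1, 2])                 # stochastic pollution inflow
    abatement = 2 if action == 1 else 0
    next_state = min(N_STATES - 1, max(0, state + inflow - abatement - decay))
    reward = -next_state - (1.0 if action == 1 else 0.0)  # penalise pollution and cost
    return next_state, reward

def train(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        decay = random.choice([0, 1])                 # parameter uncertainty
        state = random.randrange(N_STATES)
        for _ in range(50):
            if random.random() < eps:
                action = random.choice(ACTIONS)       # occasional exploration
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action, decay)
            # Q-learning update: the learned policy reacts to whichever state
            # is observed, which is what makes the resulting plan adaptive.
            q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    q_table = train()
    policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES)]
    print("State-contingent policy (1 = abate):", policy)

The printed policy is state-contingent rather than a fixed sequence of actions, which illustrates the contrast drawn in the abstract between closed-loop RL and open-loop candidate policies evaluated by other exploration methods.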
Pages: 13