Model-based offline reinforcement learning for sustainable fishery management

Cited: 0
Authors
Ju, Jun [1 ,3 ]
Kurniawati, Hanna [2 ]
Kroese, Dirk [1 ]
Ye, Nan [1 ,3 ]
Affiliations
[1] Univ Queensland, Sch Math & Phys, St Lucia, Qld, Australia
[2] Australian Natl Univ, Sch Comp, Canberra, ACT, Australia
[3] Univ Queensland, Sch Math & Phys, St Lucia, Qld 4072, Australia
Funding
Australian Research Council;
Keywords
Beverton-Holt model; fishery management; incomplete data; model misspecification; offline reinforcement learning; POMDP; Schaefer model; ADAPTIVE MANAGEMENT; DECISION; UNCERTAINTY; INFERENCE;
DOI
10.1111/exsy.13324
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fisheries, as indispensable natural resources for humans, need to be managed with both short-term economic benefits and long-term sustainability in mind. This remains a challenge because the population and catch dynamics of fisheries are complex and noisy, while the available data are often scarce and provide only partial information on the dynamics. To address these challenges, we formulate the population and catch dynamics as a Partially Observable Markov Decision Process (POMDP) and propose a model-based offline reinforcement learning approach to learn an optimal management policy. Our approach allows learning fishery management policies from possibly incomplete fishery data generated by a stochastic fishery system. It involves first learning a POMDP fishery model using a novel least squares approach, and then computing the optimal policy for the learned POMDP. The learned fishery dynamics model is also useful for explaining the resulting policy's performance. We perform a systematic and comprehensive simulation study to quantify the effects of stochasticity in fishery dynamics, proliferation rates, missing values in fishery data, dynamics model misspecification, and variability of effort (e.g., the number of boat days). When the effort is sufficiently variable and the noise is moderate, our method can produce a competitive policy that achieves 85% of the optimal value, even in the hardest case of noisy, incomplete data and a misspecified model. Interestingly, the learned policies appear robust in the presence of model learning errors. However, non-identifiability arises when there is insufficient variability in the effort level and the fishery system is stochastic; this often results in poor policies, highlighting the need for sufficiently informative data. We also provide a theoretical analysis of model misspecification and discuss the tendency of a Schaefer model to overfit compared with a Beverton-Holt model.
Pages: 28
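
For context, the two population models named in the keywords have standard textbook forms; the paper's exact parameterization, noise model, and catch equation may differ. With biomass $B_t$, effort $E_t$, proliferation rate $r$, carrying capacity $K$, and catchability $q$:

\[
\text{Schaefer:}\quad B_{t+1} = B_t + r B_t \left(1 - \frac{B_t}{K}\right) - q E_t B_t,
\qquad
\text{Beverton--Holt:}\quad B_{t+1} = \frac{r B_t}{1 + B_t / K} - q E_t B_t,
\]

where the proportional-catch term $q E_t B_t$ is a common modelling assumption rather than something stated in the abstract.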
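
The abstract's pipeline first fits a dynamics model by least squares and then plans on the fitted model. Below is a minimal Python sketch (not the authors' code) of the model-fitting step under simplifying assumptions: the dynamics are Schaefer with log-normal process noise, biomass is fully observed (the paper's POMDP setting instead treats it as latent and handles missing values), and effort is deliberately varied, since the abstract notes that identifiability fails without sufficient effort variability. All names and parameter values are illustrative.

# Minimal sketch: simulate a noisy Schaefer fishery and recover its
# parameters by least squares, assuming catch C_t = q * E_t * B_t and
# multiplicative log-normal process noise.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

def schaefer_step(B, E, r, K, q, sigma, rng):
    """One step of the Schaefer surplus-production model with noise."""
    catch = q * E * B
    B_next = max(B + r * B * (1.0 - B / K) - catch, 1e-6)
    return B_next * rng.lognormal(0.0, sigma), catch

# Simulate a trajectory with variable effort (in boat-days).
r_true, K_true, q_true, sigma = 0.4, 1000.0, 0.01, 0.05
T = 60
effort = rng.uniform(5.0, 25.0, size=T)
B = np.empty(T + 1)
B[0] = 600.0
catches = np.empty(T)
for t in range(T):
    B[t + 1], catches[t] = schaefer_step(B[t], effort[t], r_true,
                                         K_true, q_true, sigma, rng)

# Least-squares fit of (r, K, q) to the observed biomass transitions.
def residuals(theta):
    r, K, q = theta
    pred = B[:-1] + r * B[:-1] * (1.0 - B[:-1] / K) - q * effort * B[:-1]
    return pred - B[1:]

fit = least_squares(residuals, x0=[0.2, 800.0, 0.02],
                    bounds=([0.0, 1.0, 0.0], [2.0, 5000.0, 1.0]))
print("true      (r, K, q):", (r_true, K_true, q_true))
print("estimated (r, K, q):", fit.x)

Re-running the fit with constant effort (e.g., effort[:] = 15.0) illustrates the non-identifiability issue the abstract describes: with fixed E, the transitions depend on (r, K, q) only through r - qE and r/K, so a one-parameter family of models explains the data equally well.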