Selective Dyna-Style Planning Under Limited Model Capacity

Cited by: 0

Authors:

Abbas, Zaheer [1,2]

Sokota, Samuel [1,2]

Talvitie, Erin J. [3]

White, Martha [1,2]

Affiliations:

[1] Univ Alberta, Edmonton, AB, Canada

[2] Alberta Machine Intelligence Inst Amii, Edmonton, AB, Canada

[3] Harvey Mudd Coll, Claremont, CA 91711 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 | 2020 / Vol. 119

Funding:

National Science Foundation (USA);

Keywords:

ARCADE LEARNING-ENVIRONMENT; DROPOUT;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.
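As a rough illustration of the idea in the abstract, the sketch below shows how heteroscedastic regression can attach an input-dependent variance to a model's prediction, and how that variance might gate planning. It is not the authors' implementation: the function names `gaussian_nll`, `should_plan`, and `selective_planning_states`, the scalar prediction target, and the fixed variance threshold are all assumptions made for this example.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Gaussian negative log-likelihood for one sample.

    The model is assumed to output both a mean and a log-variance for
    each input, which is what makes the regression heteroscedastic.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2 * math.pi) + log_var + (target - mean) ** 2 / var)


def should_plan(log_var, threshold):
    """Hypothetical gate: trust the model only where its predicted variance is small."""
    return math.exp(log_var) <= threshold


def selective_planning_states(candidates, threshold):
    """Keep the states whose predicted variance passes the gate.

    `candidates` is a list of (state, predicted_mean, predicted_log_var)
    triples; the threshold is an illustrative hyperparameter.
    """
    return [
        state
        for (state, _mean, log_var) in candidates
        if should_plan(log_var, threshold)
    ]


if __name__ == "__main__":
    candidates = [
        ("well_modelled", 0.0, math.log(0.01)),
        ("poorly_modelled", 0.0, math.log(1.0)),
    ]
    print(selective_planning_states(candidates, threshold=0.1))
```

Under this loss, a large prediction error is penalized less when the model also reports a large variance, so a model with limited capacity can learn to flag the regions it cannot fit. A hard threshold is only one possible gating rule.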


Pages: 10
