Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning

Cited by: 2

Authors:

Zhao, Dongfang [1]

Liu, Jiafeng [1]

Wu, Rui [1]

Cheng, Dansong [1]

Tang, Xianglong [1]

Affiliations:

[1] Harbin Institute of Technology, Harbin, Heilongjiang, People's Republic of China

Source:

IEEE Access | 2019 / Vol. 7

Funding:

National Science Foundation (USA);

Keywords:

Reinforcement learning; information entropy; optimistic sampling; data efficiency;

DOI:

10.1109/ACCESS.2019.2913001

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The large number of interactions required with the environment is one of the most important problems in reinforcement learning (RL). To deal with this problem, several data-efficient RL algorithms have been proposed and successfully applied in practice. Unlike previous research, which focuses on the optimal policy evaluation and policy improvement stages, we actively select informative samples by leveraging an entropy-based optimal sampling strategy that takes the initial sample set into consideration. During the initial sampling process, information entropy is used to describe the potential samples, and the agent selects the most informative samples using an optimization method. As a result, the initial sample set is more informative than one obtained with a random or fixed strategy, so a more accurate initial dynamic model and policy can be learned. The proposed optimal sampling method thus guides the agent to search in a more informative region. Experimental results on standard benchmark problems involving a pendulum, a cart pole, and a cart double pendulum show that our optimal sampling strategy achieves better data efficiency.
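
The abstract does not spell out how entropy is computed or which optimization method is used, so the following is only a rough sketch of the general idea of entropy-guided initial sampling, not the authors' method. It assumes a Gaussian-process model with a unit-variance RBF kernel, scores each candidate by the differential entropy of its Gaussian prediction, and uses greedy selection in place of the paper's optimizer. All function names, the kernel, the noise level, and the candidate set are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed model)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def predictive_entropy(candidates, selected, noise=1e-3):
    """Differential entropy of the GP's Gaussian prediction at each candidate."""
    prior_var = np.ones(len(candidates))  # k(x, x) = 1 for this RBF kernel
    if len(selected) == 0:
        var = prior_var
    else:
        k_ss = rbf_kernel(selected, selected) + noise * np.eye(len(selected))
        k_cs = rbf_kernel(candidates, selected)
        # Posterior variance: k(x, x) - k(x, S) K^{-1} k(S, x)
        solved = np.linalg.solve(k_ss, k_cs.T)
        var = prior_var - np.sum(k_cs * solved.T, axis=1)
    var = np.maximum(var, 0.0) + noise  # add observation noise, keep positive
    return 0.5 * np.log(2.0 * np.pi * np.e * var)

def select_initial_samples(candidates, n_samples):
    """Greedily pick the candidates with the highest predictive entropy."""
    chosen = []
    empty = np.empty((0, candidates.shape[1]))
    for _ in range(n_samples):
        selected = candidates[chosen] if chosen else empty
        entropy = predictive_entropy(candidates, selected)
        if chosen:
            entropy[chosen] = -np.inf  # never pick the same candidate twice
        chosen.append(int(np.argmax(entropy)))
    return chosen

# Hypothetical candidate set: 200 random points in a 2-D input space.
rng = np.random.default_rng(0)
candidates = rng.uniform(-1.0, 1.0, size=(200, 2))
picked = select_initial_samples(candidates, n_samples=5)
```

Under these assumptions, each pick lowers the predictive entropy near itself, so later picks tend to land in regions the model still knows little about. A random or fixed initial sampling scheme has no such mechanism.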

Pages: 55763-55769

Page count: 7
