Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning

Cited by: 2
Authors
Zhao, Dongfang [1]
Liu, Jiafeng [1]
Wu, Rui [1]
Cheng, Dansong [1]
Tang, Xianglong [1]
Affiliations
[1] Harbin Institute of Technology, Harbin, Heilongjiang, People's Republic of China
Funding
National Science Foundation (USA)
Keywords
Reinforcement learning; information entropy; optimistic sampling; data efficiency
DOI
10.1109/ACCESS.2019.2913001
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The large number of interactions with the environment required for learning is one of the most important problems in reinforcement learning (RL). To deal with this problem, several data-efficient RL algorithms have been proposed and successfully applied in practice. Unlike previous research, which focuses on optimal policy evaluation and policy improvement stages, we actively select informative samples by leveraging an entropy-based optimal sampling strategy that takes the initial sample set into consideration. During the initial sampling process, information entropy is used to describe the potential samples, and the agent selects the most informative ones using an optimization method. As a result, the initial sample set is more informative than one gathered with a random or fixed strategy, so a more accurate initial dynamics model and policy can be learned. The proposed optimal sampling method thus guides the agent to search in a more informative region. Experimental results on standard benchmark problems (pendulum, cart-pole, and cart double pendulum) show that our optimal sampling strategy achieves better data efficiency.
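The entropy-based selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Gaussian process model with a squared-exponential kernel, scores each candidate by the differential entropy of its Gaussian predictive distribution, and replaces the paper's unspecified optimization method with a simple greedy search. All names (`rbf_kernel`, `predictive_entropy`, `select_informative`) and the parameters `noise` and `length_scale` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def predictive_entropy(candidates, selected, noise=1e-2, length_scale=1.0):
    """Differential entropy of the (assumed) GP predictive at each candidate."""
    var = np.ones(len(candidates))  # prior variance: k(x, x) = 1 for this kernel
    if len(selected) > 0:
        k_ss = rbf_kernel(selected, selected, length_scale)
        k_ss += noise * np.eye(len(selected))
        k_cs = rbf_kernel(candidates, selected, length_scale)
        solved = np.linalg.solve(k_ss, k_cs.T)        # shape (s, c)
        var = var - np.sum(k_cs * solved.T, axis=1)   # posterior variance
    var = np.maximum(var, 1e-12) + noise
    # Entropy of a Gaussian with variance var: 0.5 * log(2 * pi * e * var)
    return 0.5 * np.log(2.0 * np.pi * np.e * var)


def select_informative(candidates, n_select, noise=1e-2, length_scale=1.0):
    """Greedily pick the indices of the highest-entropy candidates."""
    chosen = []
    for _ in range(n_select):
        entropy = predictive_entropy(
            candidates, candidates[chosen], noise, length_scale
        )
        entropy[chosen] = -np.inf  # never pick the same candidate twice
        chosen.append(int(np.argmax(entropy)))
    return chosen


# Hypothetical one-dimensional candidate set on [0, 4].
candidates = np.linspace(0.0, 4.0, 9).reshape(-1, 1)
chosen = select_informative(candidates, n_select=3)
print(candidates[chosen].ravel())
```

Under these assumptions the greedy picks spread across the candidate range rather than clustering, which is the sense in which an entropy-guided initial sample set is more informative than a random or fixed one.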
Pages: 55763 - 55769
Page count: 7
Related papers
50 in total
  • [31] Data-Efficient Graph Learning
    Ding, Kaize
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 20, 2024, : 22663 - 22663
  • [32] Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning
    Luck, Kevin Sebastian
    Ben Amor, Heni
    Calandra, Roberto
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [33] Mix-up Consistent Cross Representations for Data-Efficient Reinforcement Learning
    Liu, Shiyu
    Cao, Guitao
    Liu, Yong
    Li, Yan
    Wu, Chunwei
    Xi, Xidong
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [34] Data-efficient deep reinforcement learning with expert demonstration for active flow control
    Zheng, Changdong
    Xie, Fangfang
    Ji, Tingwei
    Zhang, Xinshuai
    Lu, Yufeng
    Zhou, Hongjie
    Zheng, Yao
    PHYSICS OF FLUIDS, 2022, 34 (11)
  • [35] Data-Efficient Reinforcement Learning for Energy Optimization of Power-Assisted Wheelchairs
    Feng, Guoxi
    Busoniu, Lucian
    Guerra, Thierry-Marie
    Mohammad, Sami
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2019, 66 (12) : 9734 - 9744
  • [36] Load Balancing for Communication Networks via Data-Efficient Deep Reinforcement Learning
    Wu, Di
    Kang, Jikun
    Xu, Yi Tian
    Li, Hang
    Li, Jimmy
    Chen, Xi
    Rivkin, Dmitriy
    Jenkin, Michael
    Lee, Taeseop
    Park, Intaik
    Liu, Xue
    Dudek, Gregory
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021
  • [37] SDRL: Interpretable and Data-Efficient Deep Reinforcement Learning Leveraging Symbolic Planning
    Lyu, Daoming
    Yang, Fangkai
    Liu, Bo
    Gustafson, Steven
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 2970 - 2977
  • [38] SDRL: Interpretable and Data-efficient Deep Reinforcement Learning Leveraging Symbolic Planning
    Lyu, Daoming
    Yang, Fangkai
    Liu, Bo
    Gustafson, Steven
    ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE, 2019, (306) : 354 - 354
  • [39] A Survey of Data-Efficient Graph Learning
    Ju, Wei
    Yi, Siyu
    Wang, Yifan
    Long, Qingqing
    Luo, Junyu
    Xiao, Zhiping
    Zhang, Ming
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 8104 - 8113
  • [40] Uniform Priors for Data-Efficient Learning
    Sinha, Samarth
    Roth, Karsten
    Goyal, Anirudh
    Ghassemi, Marzyeh
    Akata, Zeynep
    Larochelle, Hugo
    Garg, Animesh
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4026 - 4037