Guided Reinforcement Learning via Sequence Learning

Cited: 0
Authors
Ramamurthy, Rajkumar [1 ]
Sifa, Rafet [1 ]
Luebbering, Max [1 ]
Bauckhage, Christian [1 ]
Affiliations
[1] Fraunhofer IAIS, St Augustin, Germany
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II | 2020 / Vol. 12397
Keywords
Reinforcement Learning; Exploration; Novelty Search; Representation learning; Sequence learning;
DOI
10.1007/978-3-030-61616-8_27
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Applications of Reinforcement Learning (RL) suffer from high sample complexity due to sparse reward signals and inadequate exploration. Novelty Search (NS) can serve as an auxiliary task in this regard, encouraging exploration toward unseen behaviors. However, NS has critical drawbacks in scalability and generalizability because it relies on instance-based learning. To address these challenges, we previously proposed a generic approach that uses unsupervised learning to learn representations of agent behaviors and employs reconstruction losses as novelty scores. That approach, however, considered only fixed-length sequences and did not exploit the sequential structure of behaviors. We therefore extend it here with sequential auto-encoders that capture sequential dependencies. Experimental results on benchmark tasks show that this sequence learning aids exploration, outperforming previous novelty search methods.
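A minimal sketch of the core idea the abstract describes: score the novelty of an agent's behavior by how poorly an auto-encoder trained on previously seen behaviors can reconstruct it, and use that score as an exploration bonus. For brevity this sketch substitutes a linear (PCA) autoencoder over flattened behavior vectors for the paper's sequential auto-encoder; the class name, dimensions, and data below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class PCANoveltyScorer:
    """Linear autoencoder (PCA) over flattened behavior vectors.
    Novelty of a behavior = its reconstruction error, so behaviors
    unlike anything seen so far score high and earn a larger
    exploration bonus. Stand-in for a sequential auto-encoder."""

    def __init__(self, n_components=2):
        self.k = n_components
        self.mean = None
        self.components = None  # shape (k, d)

    def fit(self, behaviors):
        # Fit the top-k principal components of the behavior archive.
        X = np.asarray(behaviors, dtype=float)
        self.mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.components = vt[: self.k]

    def novelty(self, behavior):
        x = np.asarray(behavior, dtype=float) - self.mean
        z = self.components @ x                  # encode
        x_hat = self.components.T @ z            # decode
        return float(np.sum((x - x_hat) ** 2))   # reconstruction loss

rng = np.random.default_rng(0)
# Archive of seen behaviors: they lie (approximately) on a 2-D subspace of R^6.
basis = rng.normal(size=(2, 6))
archive = rng.normal(size=(100, 2)) @ basis + 0.01 * rng.normal(size=(100, 6))

scorer = PCANoveltyScorer(n_components=2)
scorer.fit(archive)

seen = rng.normal(size=2) @ basis     # behavior like those in the archive
unseen = rng.normal(size=6) * 5.0     # behavior off the learned manifold
bonus_seen = scorer.novelty(seen)
bonus_unseen = scorer.novelty(unseen)
# The unfamiliar behavior receives the larger exploration bonus.
```

In an RL loop, the bonus would be mixed into the return (e.g. `reward + beta * novelty`) and the scorer refit periodically on the growing behavior archive; the paper's contribution is replacing this fixed-length linear model with a sequential auto-encoder that handles variable-length trajectories.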
Pages: 335-345
Page count: 11
Related Papers
50 records
  • [1] Machining sequence learning via inverse reinforcement learning
    Sugisawa, Yasutomo
    Takasugi, Keigo
    Asakawa, Naoki
    PRECISION ENGINEERING-JOURNAL OF THE INTERNATIONAL SOCIETIES FOR PRECISION ENGINEERING AND NANOTECHNOLOGY, 2022, 73 : 477 - 487
  • [2] SEQUENCE-TO-SEQUENCE ASR OPTIMIZATION VIA REINFORCEMENT LEARNING
    Tjandra, Andros
    Sakti, Sakriani
    Nakamura, Satoshi
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5829 - 5833
  • [3] Knowledge-Guided Adaptive Sequence Reinforcement Learning Model
    Li Y.
    Tong X.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (02): : 108 - 119
  • [4] Robust Reinforcement Learning via Progressive Task Sequence
    Li, Yike
    Tian, Yunzhe
    Tong, Endong
    Niu, Wenjia
    Liu, Jiqiang
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 455 - 463
  • [5] Decision Transformer: Reinforcement Learning via Sequence Modeling
    Chen, Lili
    Lu, Kevin
    Rajeswaran, Aravind
    Lee, Kimin
    Grover, Aditya
    Laskin, Michael
    Abbeel, Pieter
    Srinivas, Aravind
    Mordatch, Igor
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Sequence Adaptation via Reinforcement Learning in Recommender Systems
    Antaris, Stefanos
    Rafailidis, Dimitrios
    15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021), 2021, : 714 - 718
  • [7] Optimizing Attention for Sequence Modeling via Reinforcement Learning
    Fei, Hao
    Zhang, Yue
    Ren, Yafeng
    Ji, Donghong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3612 - 3621
  • [8] Sequence labeling via reinforcement learning with aggregate labels
    Geromel, Marcel
    Cimiano, Philipp
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 7
  • [9] Attention Guided Imitation Learning and Reinforcement Learning
    Zhang, Ruohan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9906 - 9907
  • [10] Personality-Guided Cloud Pricing via Reinforcement Learning
    Cong, Peijin
    Zhou, Junlong
    Chen, Mingsong
    Wei, Tongquan
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (02) : 925 - 943