Pre-training with Augmentations for Efficient Transfer in Model-Based Reinforcement Learning

Cited by: 0

Authors:

Esteves, Bernardo [1,2]

Vasco, Miguel [1,2]

Melo, Francisco S. [1,2]

Affiliations:

[1] INESC ID, Lisbon, Portugal

[2] Univ Lisbon, Inst Super Tecn, Lisbon, Portugal

Source:

PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I | 2023, Vol. 14115

Keywords:

Reinforcement learning; Transfer learning; Representation learning

DOI:

10.1007/978-3-031-49008-8_11

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This work explores pre-training as a strategy to allow reinforcement learning (RL) algorithms to efficiently adapt to new (albeit similar) tasks. We argue for introducing variability during the pre-training phase, in the form of augmentations to the observations of the agent, to improve the sample efficiency of the fine-tuning stage. We categorize such variability in the form of perceptual, dynamic and semantic augmentations, which can be easily employed in standard pre-training methods. We perform extensive evaluations of our proposed augmentation scheme in model-based algorithms, across multiple scenarios of increasing complexity. The results consistently show that our augmentation scheme significantly improves the efficiency of the fine-tuning to novel tasks, outperforming other state-of-the-art pre-training approaches.

Pages: 133-145

Page count: 13
