Significance extraction based on data augmentation for reinforcement learning

Cited by: 0
Authors
Han, Yuxi [1 ]
Li, Dequan [1 ]
Yang, Yang [1 ]
Affiliations
[1] Anhui Univ Sci & Technol, Fac Artificial Intelligence, Huainan 232000, Peoples R China
Keywords
Deep reinforcement learning; Visual tasks; Generalization; Data augmentation; Significance; DeepMind Control generalization benchmark
DOI
10.1631/FITEE.2400406
CLC number
TP [automation and computer technology]; TP391.4
Subject classification code
0812
Abstract
Deep reinforcement learning has shown remarkable capabilities in visual tasks, but it generalizes poorly when interference signals appear in the input images, which makes it hard to apply trained agents to new environments. To enable agents to distinguish noise signals from important pixels in images, data augmentation techniques and auxiliary networks have proven to be effective solutions. We introduce a novel algorithm, saliency-extracted Q-value by augmentation (SEQA), which encourages the agent to explore unknown states more comprehensively and to focus its attention on important information. Specifically, SEQA masks out interfering features, extracts salient features, and then updates the mask decoder network with the critic loss, encouraging the agent to attend to important features and make correct decisions. We evaluate our algorithm on the DeepMind Control generalization benchmark (DMControl-GB). The experimental results show that SEQA greatly improves training efficiency and stability, and that it surpasses state-of-the-art reinforcement learning methods in sample efficiency and generalization on most DMControl-GB tasks.
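Since the abstract only sketches the mechanism, the following PyTorch-style snippet illustrates the general pattern it describes: augment an image observation, mask it with a learned saliency map, and let the critic loss also update the mask decoder so that task-relevant pixels are kept. This is a minimal sketch under loose assumptions, not the authors' implementation; all names (MaskDecoder, Critic, random_shift), shapes, and the discrete-action critic are hypothetical simplifications (the paper's DMControl-GB setting involves continuous control, which this sketch does not reproduce).

```python
# Hypothetical sketch of the idea in the abstract: a mask decoder predicts
# per-pixel saliency over an augmented observation, the masked observation
# feeds the Q-network (critic), and the critic loss backpropagates through
# the mask decoder, encouraging it to keep pixels that matter for the value.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskDecoder(nn.Module):
    """Predicts a [0, 1] saliency mask with the same spatial size as the input."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, obs):
        return torch.sigmoid(self.net(obs))  # (B, 1, H, W) saliency mask

class Critic(nn.Module):
    """Minimal Q-network over masked 84x84 image observations."""
    def __init__(self, in_channels=3, num_actions=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),  # 84 -> 20
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),           # 20 -> 9
            nn.Flatten(),
        )
        self.head = nn.Linear(64 * 9 * 9, num_actions)

    def forward(self, obs):
        return self.head(self.encoder(obs))

def random_shift(obs, pad=4):
    """A standard pixel augmentation: replicate-pad, then randomly crop back."""
    h, w = obs.shape[-2:]
    padded = F.pad(obs, (pad, pad, pad, pad), mode="replicate")
    top = torch.randint(0, 2 * pad + 1, (1,)).item()
    left = torch.randint(0, 2 * pad + 1, (1,)).item()
    return padded[..., top:top + h, left:left + w]

mask_decoder, critic = MaskDecoder(), Critic()
optim = torch.optim.Adam(
    list(mask_decoder.parameters()) + list(critic.parameters()), lr=1e-4
)

obs = torch.rand(8, 3, 84, 84)         # a batch of image observations
actions = torch.randint(0, 6, (8, 1))  # discrete actions (illustrative only)
q_target = torch.rand(8, 1)            # Bellman targets from a target network

aug = random_shift(obs)                # augment, then extract salient pixels
masked = aug * mask_decoder(aug)       # suppress interfering features
q_pred = critic(masked).gather(1, actions)
loss = F.mse_loss(q_pred, q_target)    # critic loss also trains the mask decoder
optim.zero_grad(); loss.backward(); optim.step()
```

The key design point the abstract suggests is the shared objective: because the mask decoder receives gradients from the critic loss rather than from a separate reconstruction target, the mask is driven to preserve whatever the value function needs, not merely to reproduce the image.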
Pages: 385-399
Page count: 15