Don't overlook any detail: Data-efficient reinforcement learning with visual attention

Cited: 0
Authors
Ma, Jialin [1 ]
Li, Ce [1 ]
Feng, Zhiqiang [1 ]
Xiao, Limei [1 ]
He, Chengdan [2 ]
Zhang, Yan [2 ]
Affiliations
[1] Lanzhou Univ Technol, Sch Elect Engn & Informat Engn, Lanzhou 730050, Peoples R China
[2] Lanzhou Inst Phys, Sci & Technol Vacuum Technol & Phys Lab, Lanzhou 730050, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visual reinforcement learning; Visual attention; Don't overlook any detail; Reset; Atari 100K; MODEL;
DOI
10.1016/j.knosys.2024.112869
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the widespread application of visual reinforcement learning across domains, visual attention mechanisms have been introduced to emulate human visual processing, enabling deep models to focus on the crucial parts of an image and improving performance. However, when data samples are limited, introducing a visual attention mechanism alone can exacerbate overfitting in deep reinforcement learning (DRL) and degrade performance. Herein, we propose a method called 'Don't overlook any detail' (DOAD) to tackle this issue. First, a two-step training strategy increases the training frequency of the visual attention module without requiring explicit auxiliary tasks, fully acknowledging the pivotal role of visual attention in learning and rendering the model more adaptable to environmental changes. Second, a conditional network reset method mimics the flexibility observed in human learning: periodic resets encourage the model to adapt to new information rather than adhering excessively to early knowledge. Finally, extensive experiments were conducted on 26 games in the Atari 100K benchmark. Whereas introducing visual attention alone lowers the baseline's interquartile mean (IQM) from 0.44 to 0.37, the DOAD method raises the IQM to 0.70. DOAD elucidates the internal mechanisms of DRL and offers novel insights for applying visual attention mechanisms in DRL models when sample data are limited.
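The abstract names two mechanisms: a two-step training schedule that updates the visual attention module more often than the rest of the network, and a conditional reset that periodically reinitializes parameters so the agent does not over-commit to early experience. The record contains no code, so the following is a minimal PyTorch-style sketch of the two ideas under stated assumptions; AttentionEncoder, extra_attn_updates, reset_interval, the 84x84 four-frame Atari input, and the reset condition are all illustrative choices, not taken from the paper.

# Minimal sketch (not the authors' implementation) of the two DOAD ideas,
# assuming a PyTorch-style Atari agent with 84x84 four-frame inputs.
import torch
import torch.nn as nn

class AttentionEncoder(nn.Module):
    """Conv encoder whose features are re-weighted by a spatial attention map."""
    def __init__(self, in_channels=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.attn = nn.Conv2d(64, 1, kernel_size=1)  # 1x1 conv -> attention logits

    def forward(self, x):
        feats = self.conv(x)                         # (B, 64, 9, 9) for 84x84 input
        weights = torch.sigmoid(self.attn(feats))    # per-location weights in (0, 1)
        return feats * weights                       # attended features

encoder = AttentionEncoder()
head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 9 * 9, 18))  # 18 Atari actions
opt_all = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
opt_attn = torch.optim.Adam(encoder.attn.parameters(), lr=1e-4)

def two_step_update(loss_fn, batch, extra_attn_updates=1):
    """Step 1: one joint update of the whole network. Step 2: extra updates of
    the attention module only, raising its effective training frequency."""
    loss = loss_fn(encoder, head, batch)
    opt_all.zero_grad()
    loss.backward()
    opt_all.step()
    for _ in range(extra_attn_updates):
        loss = loss_fn(encoder, head, batch)         # fresh forward pass and graph
        opt_attn.zero_grad()
        loss.backward()
        opt_attn.step()

def maybe_reset(step, reset_interval=40_000):
    """Conditional reset: periodically reinitialize the later layers so the
    agent keeps adapting to new data instead of clinging to early knowledge."""
    if step > 0 and step % reset_interval == 0:
        for m in head.modules():
            if isinstance(m, nn.Linear):
                m.reset_parameters()

In a full agent, two_step_update would be called once per gradient step and maybe_reset inside the training loop; which layers are reset, on what condition, and how often are design choices of the paper that this sketch does not reproduce.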
Pages: 12