Don't overlook any detail: Data-efficient reinforcement learning with visual attention

Times Cited: 1
Authors
Ma, Jialin [1 ]
Li, Ce [1 ]
Feng, Zhiqiang [1 ]
Xiao, Limei [1 ]
He, Chengdan [2 ]
Zhang, Yan [2 ]
Affiliations
[1] Lanzhou Univ Technol, Sch Elect Engn & Informat Engn, Lanzhou 730050, Peoples R China
[2] Lanzhou Inst Phys, Sci & Technol Vacuum Technol & Phys Lab, Lanzhou 730050, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visual reinforcement learning; Visual attention; Don't overlook any detail; Reset; Atari 100K; Model;
DOI
10.1016/j.knosys.2024.112869
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the widespread application of visual reinforcement learning across various domains, visual attention mechanisms have been introduced to emulate human visual behavior, enabling deep models to focus on the crucial parts of images and improving performance. However, when data samples are limited, introducing visual attention alone can exacerbate overfitting in deep reinforcement learning (DRL) and degrade performance. Herein, we propose a method called 'Don't overlook any detail' (DOAD) to tackle this issue. First, a two-step training strategy increases the training frequency of the visual attention module without specifying explicit auxiliary tasks, fully exploiting the pivotal role of visual attention in the learning process and making the model more adaptable to environmental changes. Second, a conditional network reset method emulates the flexibility observed in human learning: periodic resets encourage the model to adapt to new information rather than adhering excessively to early knowledge. Finally, extensive experiments were conducted on 26 games in the Atari 100K benchmark. Whereas adding visual attention to the baseline lowers the interquartile mean (IQM) from 0.44 to 0.37, the proposed DOAD method raises the IQM to 0.70. DOAD elucidates the internal mechanisms of DRL and offers novel insights for applying visual attention mechanisms in DRL models under limited-sample conditions.
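Two ingredients named in the abstract can be illustrated concretely: the conditional network reset and the IQM evaluation metric. The sketch below is not the authors' implementation; the reset interval, the choice of which layers to reset, and all function and variable names are illustrative assumptions. Only the IQM definition is standard (the 25%-trimmed mean over per-game normalized scores, as used in Atari 100K evaluations).

```python
# Minimal sketch (not the paper's code): a hypothetical conditional reset of
# an agent's policy head, plus the interquartile mean (IQM) used to aggregate
# Atari 100K scores. Reset condition, interval, and names are assumptions.
import torch.nn as nn
from scipy.stats import trim_mean


def iqm(normalized_scores):
    """IQM = mean of the middle 50% of scores (25%-trimmed mean)."""
    return trim_mean(normalized_scores, proportiontocut=0.25)


def conditional_reset(head: nn.Module, step: int, interval: int = 40_000) -> None:
    """Every `interval` steps (hypothetical condition), re-initialize the
    later layers so the agent does not over-commit to early experience."""
    if step > 0 and step % interval == 0:
        for m in head.modules():
            if isinstance(m, (nn.Linear, nn.Conv2d)):
                m.reset_parameters()


# Example: IQM over per-game human-normalized scores (illustrative values).
scores = [0.1, 0.4, 0.7, 1.2, 0.9, 0.3]
print(f"IQM: {iqm(scores):.2f}")
```

IQM is preferred over the plain mean in this setting because it discounts outlier games at both extremes while using more of the data than the median.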
Pages: 12