CNNs with Multi-Level Attention for Domain Generalization

Cited by: 4

Authors:

Ballas, Aristotelis [1]

Diou, Christos [1]

Affiliations:

[1] Harokopio Univ Athens, Athens, Greece

Source:

PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023 | 2023

Keywords:

domain generalization; representation learning; visual attention; deep learning

DOI:

10.1145/3591106.3592263

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Over the past decade, deep convolutional neural networks have achieved significant success in image classification and ranking, and have therefore found numerous applications in multimedia content retrieval. Still, these models suffer from performance degradation when they are tested in out-of-distribution scenarios or on data originating from previously unseen data domains. In the present work, we focus on this problem of Domain Generalization and propose an alternative neural network architecture for robust, out-of-distribution image classification. We attempt to produce a model that focuses on the causal features of the depicted class for robust image classification in the Domain Generalization setting. To achieve this, we propose attending to multiple levels of information throughout a Convolutional Neural Network and leveraging the most important attributes of an image by employing trainable attention mechanisms. To validate our method, we evaluate our model on four widely accepted Domain Generalization benchmarks, where it surpasses previously reported baselines on three of the four datasets and achieves the second-best score on the fourth.
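The abstract's central idea, attending to several levels of a CNN with trainable attention, can be illustrated with a minimal sketch. This is not the authors' architecture: the pooling rule (softmax over dot-product scores), the three feature-map shapes, the random stand-in features, and the names `attention_pool` and `multi_level_descriptor` are all assumptions made for illustration. In a real model the query vectors would be learned and the features would come from intermediate convolutional layers.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Pool one feature map into a single vector.

    features: list of spatial positions, each a list of C channel values.
    query:    list of C weights (hypothetical stand-in for trainable parameters).
    Returns (pooled vector of length C, attention weights over positions).
    """
    # Score each spatial position by its dot product with the query.
    scores = [sum(q * f for q, f in zip(query, pos)) for pos in features]
    weights = softmax(scores)
    channels = len(query)
    # Attention-weighted sum of the positions, channel by channel.
    pooled = [
        sum(w * pos[c] for w, pos in zip(weights, features))
        for c in range(channels)
    ]
    return pooled, weights

def multi_level_descriptor(levels, queries):
    """Concatenate the attention-pooled vectors from every level."""
    descriptor = []
    for features, query in zip(levels, queries):
        pooled, _ = attention_pool(features, query)
        descriptor.extend(pooled)
    return descriptor

rng = random.Random(0)
# Assumed (positions, channels) per level: shallow maps are larger and thinner.
shapes = [(16, 4), (8, 6), (4, 8)]
levels = [
    [[rng.gauss(0, 1) for _ in range(c)] for _ in range(n)]
    for n, c in shapes
]
queries = [[rng.gauss(0, 1) for _ in range(c)] for _, c in shapes]

descriptor = multi_level_descriptor(levels, queries)
print(len(descriptor))  # 4 + 6 + 8 = 18
```

The concatenated descriptor would then feed a classifier head. Pooling each level separately lets shallow (texture-like) and deep (semantic) features contribute with their own attention weights, which is one plausible reading of "multiple levels of information".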

Pages: 592-596

Number of pages: 5
