Understanding The Robustness in Vision Transformers

Cited by: 0
Authors:
Zhou, Daquan [1 ,2 ]
Yu, Zhiding [2 ]
Xie, Enze [3 ]
Xiao, Chaowei [2 ,4 ]
Anandkumar, Anima [2 ,5 ]
Feng, Jiashi [1 ,6 ]
Alvarez, Jose M. [2 ]
Affiliations:
[1] Natl Univ Singapore, Singapore, Singapore
[2] NVIDIA, Santa Clara, CA 95050 USA
[3] Univ Hong Kong, Hong Kong, Peoples R China
[4] ASU, Tempe, AZ USA
[5] CALTECH, Pasadena, CA 91125 USA
[6] ByteDance, Beijing, Peoples R China
Keywords:
DOI: not available
CLC number: TP18 [Artificial Intelligence Theory];
Discipline codes: 081104; 0812; 0835; 1405;
Abstract
Recent studies show that Vision Transformers (ViTs) exhibit strong robustness against various corruptions. Although this property is partly attributed to the self-attention mechanism, there is still a lack of systematic understanding. In this paper, we examine the role of self-attention in learning robust representations. Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations. We further propose a family of fully attentional networks (FANs) that strengthen this capability by incorporating an attentional channel processing design. We validate the design comprehensively on various hierarchical backbones. Our model achieves a state-of-the-art 87.1% accuracy and 35.8% mCE on ImageNet-1k and ImageNet-C with 76.8M parameters. We also demonstrate state-of-the-art accuracy and robustness in two downstream tasks: semantic segmentation and object detection. Code will be available at https://github.com/NVlabs/FAN.
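The abstract contrasts ordinary token-wise self-attention with FAN's "attentional channel processing". The distinction can be sketched in a few lines of NumPy: standard ViT attention builds an N x N map that mixes information across spatial tokens, whereas channel attention builds a d x d map that mixes feature channels. This is an illustrative sketch only, with identity Q/K/V projections assumed; the actual FAN block (see the linked repository) differs in its details.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(x):
    """Standard ViT-style self-attention over tokens.

    x: (N, d) array of N token embeddings. The attention map is N x N,
    mixing information across spatial positions. Q/K/V projections are
    omitted (identity) to keep the sketch minimal.
    """
    n, d = x.shape
    attn = softmax(x @ x.T / np.sqrt(d))  # (N, N) token-to-token weights
    return attn @ x                        # (N, d)

def channel_self_attention(x):
    """Attention applied over feature channels instead of tokens.

    The attention map is d x d, mixing information across channels --
    the rough idea behind attentional channel processing (illustrative
    sketch, not the exact FAN block).
    """
    n, d = x.shape
    attn = softmax(x.T @ x / np.sqrt(n))  # (d, d) channel-to-channel weights
    return x @ attn.T                      # (N, d)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 64))   # e.g. 14x14 patches, 64 channels
out_spatial = spatial_self_attention(tokens)
out_channel = channel_self_attention(tokens)
```

Note that both variants preserve the (N, d) token-feature shape, so the two kinds of mixing can be stacked or interleaved within one backbone block.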
Pages: 17