Token singularity understanding and removal for transformers

Cited: 0
Authors
Wang, Dan [1]
Jiao, Licheng [1]
Zhang, Ruohan [1]
Yang, Shuyuan [1]
Liu, Fang [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Vision transformer; Global attention mechanism; Singularities in neural network; Robustness; Dual-tree complex wavelet transform; DYNAMICS; WAVELETS;
DOI
10.1016/j.knosys.2024.111718
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
This work delves into the singularity issue latent in global attention-based Transformers. Empirical and theoretical analyses elucidate that interrelationships among token channels lead to singularities, impeding the training of attention weights. Concretely, similar neighboring pixels within image patches can form intercorrelated channels after being flattened, and images dominated by a single color can likewise possess correlated channels. Furthermore, the fixed global connection architecture retains these correlation relationships, contributing to the persistence of singularities. High singularity risks reducing Transformers' performance and robustness. Based on the singularity analysis, we propose the Token Singularity Removal (TSR) strategy. It incorporates a Dual-Tree Complex Wavelet Transform (DTCWT) stem and a Feature Decorrelation (FD) loss, encouraging Transformers to learn tokens with unrelated channels and thereby eliminating singularities. Experimental validation across various image classification datasets and corrupted-image datasets demonstrates the improved accuracy and robustness of Transformers using the TSR strategy. Our code is publicly available at https://github.com/wdanc/TSR.
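
The abstract does not spell out the exact form of the Feature Decorrelation (FD) loss; the lines below are only a minimal PyTorch sketch of one plausible channel-decorrelation penalty of the kind described. The (batch, tokens, channels) layout, the function name, and the normalization are assumptions for illustration, not the authors' implementation (which is in the linked repository).

    import torch

    def feature_decorrelation_loss(tokens: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
        # tokens: (batch, num_tokens, channels); illustrative layout, not the paper's code.
        b, n, c = tokens.shape
        # Center each channel over the token dimension.
        centered = tokens - tokens.mean(dim=1, keepdim=True)
        # Per-sample channel covariance matrix: (batch, channels, channels).
        cov = centered.transpose(1, 2) @ centered / (n - 1)
        # Normalize to a correlation matrix so the penalty is scale-invariant.
        std = torch.sqrt(torch.diagonal(cov, dim1=1, dim2=2) + eps)
        corr = cov / (std.unsqueeze(2) * std.unsqueeze(1))
        # Penalize squared off-diagonal correlations, averaged over the batch.
        off_diag = corr - torch.diag_embed(torch.diagonal(corr, dim1=1, dim2=2))
        return (off_diag ** 2).sum(dim=(1, 2)).mean() / (c * (c - 1))

In training, such a penalty would typically be added to the task loss with a small weighting coefficient (e.g., total = cross_entropy + lambda * fd_loss), so that token channels are pushed toward being mutually uncorrelated.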
Pages: 13
Related Papers
50 records in total
  • [1] Learned Token Pruning for Transformers. Kim, Sehoon; Shen, Sheng; Thorsley, David; Gholami, Amir; Kwon, Woosuk; Hassoun, Joseph; Keutzer, Kurt. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022: 784-794.
  • [2] Multimodal Token Fusion for Vision Transformers. Wang, Yikai; Chen, Xinghao; Cao, Lele; Huang, Wenbing; Sun, Fuchun; Wang, Yunhe. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 12176-12185.
  • [3] Efficient Transformers with Dynamic Token Pooling. Nawrot, Piotr; Chorowski, Jan; Lancucki, Adrian; Ponti, Edoardo M. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023: 6403-6417.
  • [4] Robustifying Token Attention for Vision Transformers. Guo, Yong; Stutz, David; Schiele, Bernt. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 17511-17522.
  • [5] Token Pooling in Vision Transformers for Image Classification. Marin, Dmitrii; Chang, Jen-Hao Rick; Ranjan, Anurag; Prabhu, Anish; Rastegari, Mohammad; Tuzel, Oncel. 2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023: 12-21.
  • [6] Token Selection is a Simple Booster for Vision Transformers. Zhou, Daquan; Hou, Qibin; Yang, Linjie; Jin, Xiaojie; Feng, Jiashi. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11): 12738-12746.
  • [7] Token-Label Alignment for Vision Transformers. Xiao, Han; Zheng, Wenzhao; Zhu, Zheng; Zhou, Jie; Lu, Jiwen. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023: 5472-5481.
  • [8] TLM: Token-Level Masking for Transformers. Wu, Yangjun; Fang, Kebin; Zhang, Dongxiang; Wang, Han; Zhang, Hao; Chen, Gang. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023: 14099-14111.
  • [9] Adaptive Token Sampling for Efficient Vision Transformers. Fayyaz, Mohsen; Koohpayegani, Soroush Abbasi; Jafari, Farnoush Rezaei; Sengupta, Sunando; Joze, Hamid Reza Vaezi; Sommerlade, Eric; Pirsiavash, Hamed; Gall, Juergen. COMPUTER VISION, ECCV 2022, PT XI, 2022, 13671: 396-414.
  • [10] Understanding transformers. Hickman, I. ELECTRONICS WORLD, 2001, 107 (1788): 934-936.