ATTENTION PROBE: VISION TRANSFORMER DISTILLATION IN THE WILD

Times Cited: 1
Authors
Wang, Jiahao [1 ]
Cao, Mingdeng [1 ]
Shi, Shuwei [1 ]
Wu, Baoyuan [2 ]
Yang, Yujiu [1 ]
Affiliations
[1] Tsinghua Shenzhen International Graduate School, Shenzhen, China
[2] The Chinese University of Hong Kong, Shenzhen, China
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Funding
National Natural Science Foundation of China;
Keywords
Transformer; data-free; distillation;
DOI
10.1109/ICASSP43922.2022.9747484
CLC Number
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Vision transformers (ViTs) require intensive computational resources to achieve high performance, which often makes them unsuitable for mobile devices. A feasible strategy is to compress them using the original training data, which may not be accessible due to privacy limitations or transmission restrictions. In this case, utilizing the massive amount of unlabeled data in the wild is an alternative paradigm, one that has proven effective for compressing convolutional neural networks (CNNs). However, due to the significant differences in model structure and computation mechanism between CNNs and ViTs, it remains an open question whether a similar paradigm is suitable for ViTs. In this work, we propose a two-stage method to effectively compress ViTs using unlabeled data in the wild. First, we design an effective tool for selecting valuable data from the wild, dubbed the Attention Probe. Second, based on the selected data, we develop a probe knowledge distillation algorithm that trains a lightweight student transformer by maximizing the similarity between the heavy teacher and the lightweight student on both the outputs and the intermediate features. Extensive experimental results on several benchmarks demonstrate that the student transformer obtained by the proposed method achieves performance comparable to the baseline that requires the original training data. Code is available at: https://github.com/IIGROUP/AttentionProbe.
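As a rough illustration of the two stages described in the abstract, the following is a minimal PyTorch-style sketch. The teacher interface (returning logits, per-layer attention maps, and a feature vector), the entropy-based probe score, and the loss weights `tau` and `beta` are illustrative assumptions, not the paper's exact formulation; see the linked repository for the authors' implementation.

```python
# Illustrative sketch only: the probe criterion and loss weights are
# assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def attention_probe_score(teacher, images):
    """Score wild images with the teacher's attention (assumed interface:
    teacher returns logits, a list of per-layer attention maps of shape
    [B, heads, N, N], and a feature tensor)."""
    with torch.no_grad():
        logits, attn_maps, _ = teacher(images)
        # [CLS]-to-patch attention from the last layer: [B, heads, N-1]
        cls_attn = attn_maps[-1][:, :, 0, 1:]
        p = cls_attn / cls_attn.sum(dim=-1, keepdim=True)
        # Entropy of the attention distribution, averaged over heads
        entropy = -(p * (p + 1e-8).log()).sum(dim=-1).mean(dim=1)
    # Assumption: focused (low-entropy) attention marks valuable samples
    return -entropy

def probe_distillation_loss(student_out, teacher_out, tau=4.0, beta=1.0):
    """Match the teacher on outputs and intermediate features."""
    s_logits, s_feats = student_out
    t_logits, t_feats = teacher_out
    # Output-level distillation: KL between temperature-softened logits
    kd = F.kl_div(F.log_softmax(s_logits / tau, dim=-1),
                  F.softmax(t_logits / tau, dim=-1),
                  reduction="batchmean") * tau * tau
    # Feature-level distillation (assumes features share one dimension)
    feat = F.mse_loss(s_feats, t_feats)
    return kd + beta * feat
```

In use, one would rank the wild images by `attention_probe_score`, keep a top-scoring subset, and train the student on that subset by minimizing `probe_distillation_loss` against the frozen teacher.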
Pages: 2220-2224
Number of Pages: 5
Related Papers
50 records in total
  • [1] Vision Transformer Based on Reconfigurable Gaussian Self-attention
    Zhao, L.
    Zhou, J.-K.
    Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (09): 1976-1988
  • [2] Abnormality Detection of Blast Furnace Tuyere Based on Knowledge Distillation and a Vision Transformer
    Song, Chuanwang
    Zhang, Hao
    Wang, Yuanjun
    Wang, Yuhui
    Hu, Keyong
    APPLIED SCIENCES-BASEL, 2023, 13 (18)
  • [3] Script Identification in the Wild with FFT-Multi-grained Mix Attention Transformer
    Pan, Zhi
    Yang, Yaowei
    Ubul, Kurban
    Aysa, Alimjan
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT II, 2024, 14805: 104-117
  • [4] k-NN attention-based video vision transformer for action recognition
    Sun, Weirong
    Ma, Yujun
    Wang, Ruili
    NEUROCOMPUTING, 2024, 574
  • [5] Worker behavior recognition based on temporal and spatial self-attention of vision Transformer
    Lu, Y.-X.
    Xu, G.-H.
    Tang, B.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (03): 446-454
  • [6] Simultaneous Segmentation and Classification of Esophageal Lesions Using Attention Gating Pyramid Vision Transformer
    Ge, Peixuan
    Yan, Tao
    Wong, Pak Kin
    Li, Zheng
    Chan, In Neng
    Yu, Hon Ho
    Chan, Chon In
    Yao, Liang
    Hu, Ying
    Gao, Shan
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024: 1961-1975
  • [7] PLG-ViT: Vision Transformer with Parallel Local and Global Self-Attention
    Ebert, Nikolas
    Stricker, Didier
    Wasenmueller, Oliver
    SENSORS, 2023, 23 (07)
  • [8] Vision Transformer for Pansharpening
    Meng, Xiangchao
    Wang, Nan
    Shao, Feng
    Li, Shutao
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [9] A Survey on Vision Transformer
    Han, Kai
    Wang, Yunhe
    Chen, Hanting
    Chen, Xinghao
    Guo, Jianyuan
    Liu, Zhenhua
    Tang, Yehui
    Xiao, An
    Xu, Chunjing
    Xu, Yixing
    Yang, Zhaohui
    Zhang, Yiman
    Tao, Dacheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01): 87-110
  • [10] An illumination-guided dual attention vision transformer for low-light image enhancement
    Wen, Yanjie
    Xu, Ping
    Li, Zhihong
    Xu, Wangtu
    PATTERN RECOGNITION, 2025, 158