Efficient feature selection for pre-trained vision transformers

Times Cited: 0
Authors
Huang, Lan [1 ,2 ]
Zeng, Jia [1 ]
Yu, Mengqiang [1 ]
Ding, Weiping [3 ]
Bai, Xingyu [1 ]
Wang, Kangping [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Keywords
Feature selection; Vision transformer; Model pruning
DOI
10.1016/j.cviu.2025.104326
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Handcrafted layer-wise vision transformers have demonstrated remarkable performance in image classification. However, their high computational cost limits their practical applications. In this paper, we first identify and highlight the data-independent feature redundancy in pre-trained Vision Transformer (ViT) models. Based on this observation, we explore the feasibility of searching for the best substructure within the original pre-trained model. To this end, we propose EffiSelecViT, a novel pruning method aimed at reducing the computational cost of ViTs while preserving their accuracy. EffiSelecViT introduces importance scores for both self-attention heads and Multi-Layer Perceptron (MLP) neurons in pre-trained ViT models. L1 regularization is applied to constrain and learn these scores. In this simple way, components that are crucial for model performance are assigned higher scores, while those with lower scores are identified as less important and subsequently pruned. Experimental results demonstrate that EffiSelecViT can prune DeiT-B to retain only 64% of FLOPs while maintaining accuracy. This efficiency-accuracy trade-off is consistent across various ViT architectures. Furthermore, qualitative analysis reveals enhanced information expression in the pruned models, affirming the effectiveness and practicality of EffiSelecViT. The code is available at https://github.com/ZJ6789/EffiSelecViT.
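The score-and-prune mechanism described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation (see their repository for that): the gate list `scores`, the `l1_step` update, and the `keep_ratio` threshold are hypothetical names, and the paper learns the importance scores jointly with the full network rather than on a toy objective.

```python
# Minimal sketch of score-based structured pruning with L1 regularization.
# Assumption: one learnable scalar importance score per attention head or
# MLP neuron, trained alongside the frozen/pre-trained ViT weights.

def l1_step(scores, task_grads, lam=0.01, lr=0.1):
    """One SGD step on  L = task_loss + lam * sum(|s_i|).
    The L1 term's subgradient is sign(s_i), which drives the scores of
    unimportant components toward zero."""
    def sign(s):
        return (s > 0) - (s < 0)
    return [s - lr * (g + lam * sign(s)) for s, g in zip(scores, task_grads)]

def prune_by_score(scores, keep_ratio=0.64):
    """Return sorted indices of the components with the largest |score|;
    the remaining components would be removed from the pre-trained model."""
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]),
                    reverse=True)
    return sorted(ranked[:k])

# Example: four attention heads; heads 0 and 2 carry most of the importance.
scores = [0.90, 0.01, -0.50, 0.02]
kept = prune_by_score(scores, keep_ratio=0.5)
print(kept)  # -> [0, 2]
```

The 0.64 default mirrors the abstract's reported operating point for DeiT-B (64% of the original FLOPs retained); in practice the keep ratio is a tunable efficiency-accuracy knob.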
Pages: 10
Related Papers
50 records
  • [1] Classifying microfossil radiolarians on fractal pre-trained vision transformers
    Mimura, Kazuhide
    Itaki, Takuya
    Kataoka, Hirokatsu
    Miyakawa, Ayumu
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [2] ViTMatte: Boosting image matting with pre-trained plain vision transformers
    Yao, Jingfeng
    Wang, Xinggang
    Yang, Shusheng
    Wang, Baoyuan
    INFORMATION FUSION, 2024, 103
  • [3] Interpretable domain adaptation using unsupervised feature selection on pre-trained source models
    Zhang, Luxin
    Germain, Pascal
    Kessaci, Yacine
    Biernacki, Christophe
    NEUROCOMPUTING, 2022, 511 : 319 - 336
  • [4] Modified genetic algorithm-based feature selection combined with pre-trained deep neural network for demand forecasting in outpatient department
    Jiang, Shancheng
    Chin, Kwai-Sang
    Wang, Long
    Qu, Gang
    Tsui, Kwok L.
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 82 : 216 - 230
  • [5] Underwater Image Enhancement Using Pre-trained Transformer
    Boudiaf, Abderrahmene
    Guo, Yuhang
    Ghimire, Adarsh
    Werghi, Naoufel
    De Masi, Giulia
    Javed, Sajid
    Dias, Jorge
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT III, 2022, 13233 : 480 - 488
  • [6] Token Selection is a Simple Booster for Vision Transformers
    Zhou, Daquan
    Hou, Qibin
    Yang, Linjie
    Jin, Xiaojie
    Feng, Jiashi
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 12738 - 12746
  • [7] AttnZero: Efficient Attention Discovery for Vision Transformers
    Li, Lujun
    Wei, Zimian
    Dong, Peijie
    Luo, Wenhan
    Xue, Wei
    Liu, Qifeng
    Guo, Yike
    COMPUTER VISION - ECCV 2024, PT V, 2025, 15063 : 20 - 37
  • [8] An optimal deep learning approach for breast cancer detection and classification with pre-trained CNN-based feature learning mechanism
    Meena, L. C.
    Joe Prathap, P. M.
    JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS, 2024
  • [9] Target to Source Coordinate-Wise Adaptation of Pre-trained Models
    Zhang, Luxin
    Germain, Pascal
    Kessaci, Yacine
    Biernacki, Christophe
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I, 2021, 12457 : 378 - 394
  • [10] Towards Efficient Adversarial Training on Vision Transformers
    Wu, Boxi
    Gu, Jindong
    Li, Zhifeng
    Cai, Deng
    He, Xiaofei
    Liu, Wei
    COMPUTER VISION, ECCV 2022, PT XIII, 2022, 13673 : 307 - 325