Efficient feature selection for pre-trained vision transformers

Cited by: 2
Authors
Huang, Lan [1 ,2 ]
Zeng, Jia [1 ]
Yu, Mengqiang [1 ]
Ding, Weiping [3 ]
Bai, Xingyu [1 ]
Wang, Kangping [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Keywords
Feature selection; Vision transformer; Model pruning
DOI
10.1016/j.cviu.2025.104326
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Handcrafted layer-wise vision transformers have demonstrated remarkable performance in image classification. However, their high computational cost limits their practical applications. In this paper, we first identify and highlight the data-independent feature redundancy in pre-trained Vision Transformer (ViT) models. Based on this observation, we explore the feasibility of searching for the best substructure within the original pre-trained model. To this end, we propose EffiSelecViT, a novel pruning method aimed at reducing the computational cost of ViTs while preserving their accuracy. EffiSelecViT introduces importance scores for both self-attention heads and Multi-Layer Perceptron (MLP) neurons in pre-trained ViT models. L1 regularization is applied to constrain and learn these scores. In this simple way, components that are crucial for model performance are assigned higher scores, while those with lower scores are identified as less important and subsequently pruned. Experimental results demonstrate that EffiSelecViT can prune DeiT-B to retain only 64% of FLOPs while maintaining accuracy. This efficiency-accuracy trade-off is consistent across various ViT architectures. Furthermore, qualitative analysis reveals enhanced information expression in the pruned models, affirming the effectiveness and practicality of EffiSelecViT. The code is available at https://github.com/ZJ6789/EffiSelecViT.
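The abstract's score-and-prune recipe can be illustrated with a toy sketch: each attention head or MLP neuron gets a learnable importance score, an L1 penalty drives the scores of less useful components toward zero, and the lowest-scoring components are pruned. The snippet below is a simplified illustration, not the authors' implementation: the `utilities` vector stands in for each component's contribution to the real training loss, and the function names and hyperparameters are our own assumptions.

```python
def learn_importance_scores(utilities, l1_lambda=0.1, lr=0.05, steps=500):
    """Learn one importance score per component by gradient descent on
    a toy objective: maximize sum(utilities[i] * score[i]) subject to an
    L1 penalty l1_lambda * sum(|score[i]|), with scores kept in [0, 1]."""
    scores = [1.0] * len(utilities)
    for _ in range(steps):
        for i, u in enumerate(utilities):
            # Subgradient of the L1 term (taken as 0 at exactly zero).
            sign = 1.0 if scores[i] > 0 else 0.0
            grad = -u + l1_lambda * sign
            # Gradient step, clipped back into [0, 1].
            scores[i] = min(1.0, max(0.0, scores[i] - lr * grad))
    return scores


def prune(scores, keep_ratio=0.5):
    """Return the (sorted) indices of the highest-scoring components;
    everything else would be removed from the network."""
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In the actual method the scores multiply head and neuron outputs inside a pre-trained ViT and are learned against the task loss; here a fixed utility vector plays that role so the thresholding behavior of the L1 penalty is easy to see: components whose utility falls below the penalty strength are driven to zero and pruned.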
Pages: 10