MG-ViT: A Multi-Granularity Method for Compact and Efficient Vision Transformers

Cited by: 0
Authors
Zhang, Yu [1]
Liu, Yepeng [2]
Miao, Duoqian [1]
Zhang, Qi [1]
Shi, Yiwei [3]
Hu, Liang [1]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
[2] Univ Florida, Gainesville, FL 32611 USA
[3] Univ Bristol, Bristol BS81TH, Avon, England
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
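Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract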
Vision Transformer (ViT) faces obstacles to wide application because of its high computational cost. Almost all existing studies on compressing ViT split an image at a single granularity, and very few explore splitting an image at multiple granularities. Important information often concentrates, seemingly at random, in a few regions of an image, so attention should be allocated to an image at multiple granularities. Motivated by this, we introduce a multi-granularity strategy for compressing ViT that is simple but effective. We propose a two-stage multi-granularity framework, MG-ViT, to balance ViT's performance and computational cost. In the single-granularity inference stage, an input image is split into a small number of patches for simple inference. If necessary, the multi-granularity inference stage is triggered, in which the important patches are further split into multiple finer-grained patches for subsequent inference. Moreover, prior compression studies address only classification, whereas we extend the multi-granularity strategy to hierarchical ViT for downstream tasks such as detection and segmentation. Extensive experiments demonstrate the effectiveness of the multi-granularity strategy. For instance, on ImageNet, without any loss of performance, MG-ViT reduces the FLOPs of LV-ViT-S by 47% and those of DeiT-S by 56%.
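The two-stage patch splitting described in the abstract can be illustrated with a small token-counting sketch. This is not the authors' implementation: the abstract does not say how "if necessary" is decided or how important patches are identified, so the confidence threshold, the per-patch importance scores, the patch sizes, and a single 2x2 subsplit factor are all assumptions made for illustration, and every function name below is hypothetical.

```python
from typing import List, Sequence, Tuple

# A square patch described as (top, left, size) in pixels.
Patch = Tuple[int, int, int]


def split_coarse(image_size: int, patch_size: int) -> List[Patch]:
    """Split a square image into non-overlapping coarse patches."""
    if image_size % patch_size != 0:
        raise ValueError("patch_size must divide image_size")
    return [
        (top, left, patch_size)
        for top in range(0, image_size, patch_size)
        for left in range(0, image_size, patch_size)
    ]


def subsplit(patch: Patch, factor: int = 2) -> List[Patch]:
    """Split one patch into factor x factor finer patches."""
    top, left, size = patch
    if size % factor != 0:
        raise ValueError("factor must divide the patch size")
    fine = size // factor
    return [
        (top + i * fine, left + j * fine, fine)
        for i in range(factor)
        for j in range(factor)
    ]


def multi_granularity_patches(
    coarse: Sequence[Patch],
    scores: Sequence[float],
    keep_top: int,
    factor: int = 2,
) -> List[Patch]:
    """Subsplit the keep_top highest-scoring patches; keep the rest coarse.

    Assumption: `scores` stands in for whatever importance measure is used.
    """
    ranked = sorted(range(len(coarse)), key=lambda i: scores[i], reverse=True)
    important = set(ranked[:keep_top])
    mixed: List[Patch] = []
    for i, patch in enumerate(coarse):
        mixed.extend(subsplit(patch, factor) if i in important else [patch])
    return mixed


def two_stage_patches(
    image_size: int,
    patch_size: int,
    scores: Sequence[float],
    confidence: float,
    threshold: float,
    keep_top: int,
) -> List[Patch]:
    """Return the patches used for the final prediction.

    Assumption: the second stage runs only when the coarse-stage
    confidence falls below a threshold.
    """
    coarse = split_coarse(image_size, patch_size)
    if confidence >= threshold:
        return coarse  # single-granularity stage is enough
    return multi_granularity_patches(coarse, scores, keep_top)
```

Under these assumptions, a 224-pixel image with 32-pixel coarse patches yields 49 tokens in the first stage. Subsplitting the 10 highest-scoring patches gives 39 coarse plus 40 fine patches, or 79 tokens, compared with 196 tokens for a uniform 16-pixel split.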
Pages: 20