MG-ViT: A Multi-Granularity Method for Compact and Efficient Vision Transformers

Cited by: 0
|
Authors
Zhang, Yu [1]
Liu, Yepeng [2]
Miao, Duoqian [1]
Zhang, Qi [1]
Shi, Yiwei [3]
Hu, Liang [1]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
[2] Univ Florida, Gainesville, FL 32611 USA
[3] Univ Bristol, Bristol BS81TH, Avon, England
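Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes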
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Vision Transformer (ViT) faces obstacles to wide application because of its high computational cost. Almost all existing studies on compressing ViT split an image at a single granularity, and very few explore splitting an image at multiple granularities. Important information often concentrates, seemingly at random, in a few regions of an image, which calls for allocating attention to an image at multiple granularities. Motivated by this, we introduce a simple but effective multi-granularity strategy to compress ViT. We propose a two-stage multi-granularity framework, MG-ViT, to balance ViT's performance and computational cost. In the single-granularity inference stage, an input image is split into a small number of patches for simple inference. If necessary, the multi-granularity inference stage is triggered, in which the important patches are further split into finer-grained patches for subsequent inference. Moreover, prior compression studies address only classification, whereas we extend the multi-granularity strategy to hierarchical ViT for downstream tasks such as detection and segmentation. Extensive experiments demonstrate the effectiveness of the multi-granularity strategy. For instance, on ImageNet, without any loss of performance, MG-ViT reduces the FLOPs of LV-ViT-S by 47% and those of DeiT-S by 56%.
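The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how "if necessary" is decided or how important patches are identified, so the confidence threshold, the `score` callable, the `classify` callable, the `top_k` parameter, and all function names below are hypothetical placeholders. Token embedding and the transformer itself are omitted; the sketch only shows how the patch set changes between the two stages.

```python
import numpy as np

def split_patches(image, size):
    """Split an (H, W, C) array into non-overlapping size x size patches, row-major."""
    h, w, c = image.shape
    assert h % size == 0 and w % size == 0, "image must tile evenly"
    tiles = image.reshape(h // size, size, w // size, size, c)
    return list(tiles.transpose(0, 2, 1, 3, 4).reshape(-1, size, size, c))

def refine_patches(coarse_patches, importance, fine, top_k):
    """Keep unimportant coarse patches; split the top_k most important into finer ones."""
    important = set(np.argsort(importance)[::-1][:top_k].tolist())
    mixed = []
    for i, patch in enumerate(coarse_patches):
        if i in important:
            mixed.extend(split_patches(patch, fine))  # finer granularity
        else:
            mixed.append(patch)                       # stays coarse
    return mixed

def two_stage_infer(image, classify, score, coarse=8, fine=4, top_k=1, threshold=0.9):
    """Return (predicted label, number of patches used).

    classify: hypothetical callable, list of patches -> class probabilities.
    score:    hypothetical callable, list of patches -> importance per patch.
    threshold: assumed early-exit rule (not stated in the abstract).
    """
    # Stage 1: single-granularity inference on a few coarse patches.
    coarse_patches = split_patches(image, coarse)
    probs = classify(coarse_patches)
    if probs.max() >= threshold:
        return int(probs.argmax()), len(coarse_patches)
    # Stage 2: multi-granularity inference on a mix of coarse and fine patches.
    mixed = refine_patches(coarse_patches, score(coarse_patches), fine, top_k)
    probs = classify(mixed)
    return int(probs.argmax()), len(mixed)
```

With stub callables, a 16 x 16 image yields 4 coarse patches in stage 1; if stage 2 runs with `top_k=1`, one coarse patch is replaced by 4 fine patches, giving 7 patches in total. The compute saving in such a scheme comes from easy inputs exiting after stage 1 and from refining only a subset of patches in stage 2.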
Pages: 20
Related papers
50 in total
  • [31] An Efficient Scalable Multi-granularity HEVC Encoder Based on Embedded System
    Sun, Shiming
    Jiang, Hongxu
    Liu, Tingshan
    Li, Bo
    2015 8TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP), 2015, : 85 - 91
  • [32] A Multi-Granularity FPGA With Hierarchical Interconnects for Efficient and Flexible Mobile Computing
    Yuan, Fang-Li
    Wang, Cheng C.
    Yu, Tsung-Han
    Markovic, Dejan
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2015, 50 (01) : 137 - 149
  • [33] Granular-ball computing: an efficient, robust, and interpretable adaptive multi-granularity representation and computation method
    Xia, Shuyin
    Wang, Guoyin
    Gao, Xinbo
    Lian, Xiaoyu
    arXiv, 2023,
  • [34] Efficient Adapting for Vision-language Foundation Model in Edge Computing Based on Personalized and Multi-Granularity Federated Learning
    Gao, Fei
    Zhao, Yunfeng
    Qiu, Chao
    Wang, Xiaofei
    IEEE INFOCOM 2024-IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS, INFOCOM WKSHPS 2024, 2024,
  • [35] Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
    Chen, Peihao
    Ji, Dongyu
    Lin, Kunyang
    Zeng, Runhao
    Li, Thomas H.
    Tan, Mingkui
    Gan, Chuang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [36] Multi-Granularity Modeling Method for Effectiveness Evaluation of Remote Sensing Satellites
    Lei, Ming
    Dong, Yunfeng
    REMOTE SENSING, 2023, 15 (17)
  • [37] Multi-granularity onboard decision method for optical space surveillance satellite
    Sun, Y. J.
    Dong, Y. F.
    AERONAUTICAL JOURNAL, 2024,
  • [38] Multi-granularity Grey Incidence Measurement Method to Data Distribution Sequence
    Dai, Jin
    Wang, Mei
    Liu, Huijie
    Sun, Yannan
    2017 2ND IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2017), 2017, : 87 - 92
  • [39] Research on Multi-Granularity Neural Network Pruning Method with Regularization Mechanism
    Liu Q.
    Chen Y.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (08): : 2202 - 2212
  • [40] A method for group decision making with multi-granularity linguistic assessment information
    Jiang, Yan-Ping
    Fan, Zhi-Ping
    Ma, Jian
    INFORMATION SCIENCES, 2008, 178 (04) : 1098 - 1109