GAT TransPruning: progressive channel pruning strategy combining graph attention network and transformer

Cited by: 0
Authors
Lin Y.-C. [1]
Wang C.-H. [1]
Lin Y.-C. [1]
Affiliations
[1] Department of Automatic Control Engineering, Feng Chia University, Taichung
Keywords
Artificial Intelligence; Computer Vision; Edge computing platform; Embedded Computing; Graph attention network; Model compression; Neural Networks; Progressive channel pruning; Self-attention mechanism; Algorithms and Analysis of Algorithms; Transformer
DOI
10.7717/peerj-cs.2012
Abstract
Recently, large-scale artificial intelligence models with billions of parameters have achieved strong experimental results, but their practical deployment on edge computing platforms is often constrained by their resource requirements. These models need powerful computing platforms with high memory capacity to store and process the numerous parameters and activations, which makes deploying such large-scale models directly a challenge. Model compression techniques therefore play a crucial role in making these models practical and accessible. In this article, a progressive channel pruning strategy combining a graph attention network and a transformer, namely GAT TransPruning, is proposed; it uses graph attention networks (GAT) and the transformer attention mechanism to determine the channel-to-channel relationships in large networks. This approach ensures that the network maintains its critical functional connections and optimizes the trade-off between model size and performance. In this study, VGG-16, VGG-19, ResNet-18, ResNet-34, and ResNet-50 are used as large-scale network models with the CIFAR-10 and CIFAR-100 datasets for verification and quantitative analysis of the proposed progressive channel pruning strategy. The experimental results reveal that accuracy drops by only 6.58% at a channel pruning rate of 89% for VGG-19/CIFAR-100. In addition, the lightweight model's inference speed is 9.10 times faster than that of the original large model. Compared with traditional channel pruning schemes, the proposed progressive channel pruning strategy based on the GAT and transformer not only removes insignificant weight channels and effectively reduces the model size, but also keeps the performance drop of the resulting lightweight model the smallest, even at high pruning ratios. © 2024 Lin et al. Distributed under Creative Commons CC-BY 4.0. All Rights Reserved.
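To make the progressive channel pruning idea concrete, below is a minimal, hypothetical PyTorch sketch. It uses a simple stand-in importance score (the L1 norm of each filter) in place of the GAT and transformer attention scorer described in the abstract, and the names `channel_importance` and `prune_step` are illustrative only, not the authors' implementation.

```python
# Hypothetical sketch of progressive channel pruning driven by per-channel
# importance scores. GAT TransPruning derives these scores from a graph
# attention network plus transformer attention over channel relationships;
# here a stand-in score (L1 norm of each filter) is used purely to illustrate
# the progressive pruning loop.
import torch
import torch.nn as nn


def channel_importance(conv: nn.Conv2d) -> torch.Tensor:
    """Stand-in importance score: L1 norm of each output-channel filter."""
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def prune_step(conv: nn.Conv2d, prune_ratio: float) -> nn.Conv2d:
    """Remove the lowest-scoring fraction of output channels in one round."""
    scores = channel_importance(conv)
    n_keep = max(1, int(conv.out_channels * (1.0 - prune_ratio)))
    keep_idx = torch.topk(scores, n_keep).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned


# Progressive schedule: prune a small fraction per round (with fine-tuning
# between rounds in practice) instead of removing all channels at once.
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
for step in range(3):
    conv = prune_step(conv, prune_ratio=0.25)
    print(f"round {step + 1}: {conv.out_channels} channels remain")
```

Note that in a full network, pruning one layer's output channels also requires removing the matching input channels of the next layer and fine-tuning between rounds; modeling those channel-to-channel dependencies is precisely what the GAT and transformer components address, which this per-layer sketch does not capture.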