The Possibility of Combining and Implementing Deep Neural Network Compression Methods

Cited by: 12

Authors:

Predic, Bratislav [1]

Vukic, Uros [1]

Saracevic, Muzafer [2]

Karabasevic, Darjan [3]

Stanujkic, Dragisa [4]

Affiliations:

[1] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nish 18000, Serbia

[2] Univ Novi Pazar, Dept Comp Sci, Novi Pazar 36300, Serbia

[3] Univ Business Acad Novi Sad, Fac Appl Management Econ & Finance, Jevrejska 24, Belgrade 11000, Serbia

[4] Univ Belgrade, Tech Fac Bor, Vojske Jugoslavije 12, Bor 19210, Serbia

Source:

AXIOMS | 2022 / Vol. 11 / Issue 05

Keywords:

deep learning; convolutional neural networks; deep neural network model; combining methods; implementation; SYSTEM; IOT;

DOI:

10.3390/axioms11050229

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Subject classification code:

070104;

Abstract:

This paper considers whether deep neural network (DNN) model compression methods can be combined to achieve better compression results. To compare the advantages and disadvantages of each method, all methods were applied to a ResNet18 model pretrained on the NCT-CRC-HE-100K dataset, with CRC-VAL-HE-7K used as the validation dataset. In the proposed method, quantization, pruning, weight clustering, quantization-aware training (QAT), preserve cluster QAT (hereinafter PCQAT), and distillation were used to compress ResNet18. The final evaluation of the resulting models was carried out on a Raspberry Pi 4 device using the validation dataset. The largest reduction in on-disk model size was achieved with PCQAT, which shrank the initial model by as much as 45 times, whereas the largest model speedup was achieved via distillation on the MobileNetV2 model. All methods reduced the initial model size, with either a slight loss in model accuracy or, in the case of QAT and weight clustering, an increase in model accuracy. INT8 quantization and knowledge distillation also led to a significant decrease in model execution time.
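As a rough illustration of what INT8 quantization does to a weight tensor, the sketch below maps floating-point weights to 8-bit integers and back. It is not the authors' pipeline: the symmetric per-tensor scheme, the function names `quantize_int8` and `dequantize`, and the sample weights are all assumptions chosen for brevity.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization (an assumed scheme):
    each weight w is approximated by scale * q with q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point weights from INT8 codes."""
    return [scale * v for v in q]


# Hypothetical weights, not taken from the paper.
weights = [0.5, -1.27, 0.0, 0.731, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Storing one byte per weight instead of a four-byte float32 gives roughly a 4x size reduction on its own, so the much larger factor reported in the abstract would have to come from combining this with the other methods.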


Pages: 21
