The Possibility of Combining and Implementing Deep Neural Network Compression Methods

Cited by: 12
Authors
Predic, Bratislav [1]
Vukic, Uros [1]
Saracevic, Muzafer [2]
Karabasevic, Darjan [3]
Stanujkic, Dragisa [4]
Affiliations
[1] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nish 18000, Serbia
[2] Univ Novi Pazar, Dept Comp Sci, Novi Pazar 36300, Serbia
[3] Univ Business Acad Novi Sad, Fac Appl Management Econ & Finance, Jevrejska 24, Belgrade 11000, Serbia
[4] Univ Belgrade, Tech Fac Bor, Vojske Jugoslavije 12, Bor 19210, Serbia
Keywords
deep learning; convolutional neural networks; deep neural network model; combining methods; implementation; system; IoT
DOI
10.3390/axioms11050229
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper considers whether deep neural network (DNN) model compression methods can be combined to achieve better compression results. To compare the advantages and disadvantages of each method, all methods were applied to a ResNet18 model pretrained on the NCT-CRC-HE-100K dataset, with CRC-VAL-HE-7K used as the validation dataset. Quantization, pruning, weight clustering, quantization-aware training (QAT), cluster-preserving QAT (hereinafter PCQAT), and distillation were applied to compress ResNet18. The resulting models were finally evaluated on a Raspberry Pi 4 device using the validation dataset. The greatest reduction in on-disk model size was achieved with PCQAT, which reduced the size of the initial model by as much as 45 times, whereas the greatest model acceleration was achieved via distillation to the MobileNetV2 model. All methods reduced the initial model size, with either a slight loss in model accuracy or, in the case of QAT and weight clustering, an increase in model accuracy. INT8 quantization and knowledge distillation also led to a significant decrease in model execution time.
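As a rough illustration of how two of the methods named in the abstract can be chained, the sketch below applies magnitude pruning followed by symmetric per-tensor INT8 quantization to a flat list of weights. It is not the authors' implementation: the function names, the per-tensor symmetric scheme, and the sparsity parameter are assumptions chosen for illustration, and the code uses only the Python standard library rather than any deep learning framework.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper for illustration; real pruning is usually applied
    gradually during fine-tuning rather than in one shot.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization (an assumed scheme).

    Returns integer codes q in [-127, 127] and a scale such that
    w is approximately scale * q.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [scale * v for v in q]


def compress(weights, sparsity):
    """Chain the two steps: prune first, then quantize the survivors."""
    pruned = prune_by_magnitude(weights, sparsity)
    q, scale = quantize_int8(pruned)
    return q, scale
```

Pruned weights map to the integer code 0, so the sparsity introduced by the first step survives the second. This is one reason such stages compose; an accuracy-preserving pipeline would normally also fine-tune between steps, which this sketch omits.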
Pages: 21
  • [10] Canziani A., 2016, ARXIV