SFAO: Sign-Flipping-Aware Optimization for Early-Stopping of Binarized Neural Networks

Cited: 0
Authors
Kang, Ju Yeon [1 ]
Ryu, Chang Ho [2 ]
Kang, Suk Bong [1 ]
Han, Tae Hee [2 ,3 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Artificial Intelligence, Suwon 16419, South Korea
[3] Sungkyunkwan Univ, Dept Semicond Syst Engn, Suwon 16419, South Korea
Keywords
Training; Computational efficiency; Neural networks; Computational modeling; Quantization (signal); Optimization; Backpropagation; Artificial intelligence; Machine learning; model compression; optimizer; efficient machine learning; binarized neural networks; layer freezing
DOI
10.1109/ACCESS.2023.3332472
CLC number
TP [automation technology, computer technology]
Discipline code
0812
Abstract
A key challenge for binarized neural networks (BNNs) is improving their inference performance by expanding their data representation capability enough to capture subtle patterns and nuances in the data. At the same time, containing the computational cost of the training phase is critical for sustainable development and scalable deployment. In this study, a sign-flipping-aware optimizer (SFAO) tailored to BNNs is introduced to reduce this training burden. SFAO balances model performance and computational cost by applying sign-flipping-aware updating rules throughout BNN training. With its binary-weight-specific updating rules, SFAO considerably reduced the computing resources needed for training on the CIFAR-10 dataset, cutting the total instruction count by 21.89% compared with the conventional full-precision updating rule, at the cost of a marginal 0.44% decline in image classification accuracy relative to full-precision parameter updates. Furthermore, early stopping based on the sign flip rate reduced the average computation time per network by 9.37% on the ImageNet dataset.
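The record does not reproduce SFAO's actual updating rule, but the mechanism the abstract describes, tracking how often binary weights flip sign between updates and stopping training once the flip rate stabilizes, can be illustrated with a minimal Python/NumPy sketch. Everything here is an assumption for illustration: the plain straight-through-estimator SGD step on latent weights, the random stand-in gradient, and the flip_threshold value are hypothetical placeholders, not the authors' method.

    # Minimal sketch (not the authors' code): track the fraction of binary
    # weights whose sign flips between consecutive updates, and stop early
    # once that rate falls below a hypothetical threshold.
    import numpy as np

    rng = np.random.default_rng(0)

    # Latent full-precision weights; a BNN's forward pass uses sign(w_latent).
    w_latent = rng.normal(size=1000)
    flip_threshold = 0.01          # hypothetical early-stopping threshold

    prev_signs = np.sign(w_latent)
    for epoch in range(100):
        lr = 0.5 / (1 + epoch)     # decaying step size, for illustration
        # Stand-in gradient w.r.t. the binarized weights (random here; in
        # practice it comes from backpropagation through the network).
        grad = rng.normal(size=w_latent.shape)
        # Straight-through-estimator step: update the latent weights, then
        # clip them to [-1, 1] as is common in BNN training.
        w_latent = np.clip(w_latent - lr * grad, -1.0, 1.0)

        signs = np.sign(w_latent)
        sign_flip_rate = np.mean(signs != prev_signs)
        prev_signs = signs

        if sign_flip_rate < flip_threshold:
            print(f"early stop at epoch {epoch}: flip rate {sign_flip_rate:.4f}")
            break

The intuition matching the abstract is that once very few binary weights still change sign, further epochs barely alter the deployed (binarized) model, so training can be cut short with little accuracy loss.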
Pages: 128306-128315
Page count: 10