Attention-Based Batch Normalization for Binary Neural Networks

Cited by: 1
Authors
Gu, Shan [1]
Zhang, Guoyin [1]
Jia, Chengwei [1]
Wu, Yanxia [1]
Affiliations
[1] Harbin Engineering University, College of Computer Science and Technology, Harbin 150009, People's Republic of China
Keywords
Binary neural networks; batch normalization; deep learning; convolutional neural networks
DOI
10.3390/e27060645
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
Batch normalization (BN) is crucial for achieving state-of-the-art binary neural networks (BNNs). Unlike full-precision neural networks, BNNs restrict activations to the discrete values {-1, 1}, which calls for a renewed examination of the role and significance of BN layers in BNNs. Many studies have noticed this phenomenon and tried to explain it. Inspired by these studies, we introduce the self-attention mechanism into BN and propose a novel Attention-Based Batch Normalization (ABN) for binary neural networks. We also present an ablation study of parameter trade-offs in ABN and an experimental analysis of the effect of ABN on BNNs. The analysis shows that ABN helps to capture image features, provides additional activation-like functions, and increases the imbalance of the activation distribution, all of which help to improve the performance of BNNs. Furthermore, we conduct image classification experiments on the CIFAR10, CIFAR100, and TinyImageNet datasets using the BinaryNet and ResNet-18 network structures. The results demonstrate that ABN consistently outperforms the baseline BN in image classification accuracy across these benchmark datasets and models. In addition, ABN exhibits less variance on the CIFAR datasets, which suggests that ABN can improve the stability and reliability of models.
Pages: 13