Attention-based capsule network with shared parameters

Cited by: 0
Authors
Song Y. [1 ]
Qin Y.-Z. [1 ]
Zeng R. [1 ]
Affiliations
[1] Department of Control Science and Engineering, University of Shanghai for Science and Technology, Shanghai
Source
Kongzhi yu Juece/Control and Decision | 2023, Vol. 38, No. 6
Keywords
adversarial attacks; attention; capsule network; image classification; robustness; shared parameters
DOI
10.13195/j.kzyjc.2021.1825
Abstract
To address the feature propagation redundancy and inefficient feature deconstruction of traditional capsule networks, this paper proposes an attention-based capsule network with shared parameters. The merits of the network lie mainly in two aspects. 1) A dynamic routing method based on an attention mechanism is proposed: it computes the correlation between low-level capsules, preserving the spatial information of features and attending more strongly to highly correlated feature information, and thereby carries out forward propagation. 2) A transformation matrix with shared parameters is introduced in the dynamic routing layer: high-level capsules are activated according to the voting consistency of the low-level capsules, and the shared matrix reduces the number of model parameters while improving the robustness of the capsule network. Comparative classification experiments on five public datasets show that the proposed network achieves the best results, with classification error rates of 5.17 %, 3.67 % and 9.35 % on Fashion-MNIST, SVHN and CIFAR-10, respectively, and that it is significantly robust against white-box adversarial attacks. Transformation experiments on the smallNORB and affNIST public datasets further show that the proposed network is clearly robust to input transformations. Finally, computational-efficiency experiments show that parameter sharing reduces the parameters of the traditional capsule network by 4.9 % without adding floating-point operations, giving the proposed network a clear computational advantage. © 2023 Northeast University. All rights reserved.
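The abstract gives no implementation details, but the two mechanisms it names (attention-weighted routing over low-level capsules, and a transformation matrix shared across them) can be sketched. Below is a minimal, hypothetical PyTorch sketch written from the abstract alone; the class name, tensor shapes, dot-product attention, and mean-vote activation are all assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SharedAttentionRouting(nn.Module):
    """Hypothetical routing layer: attention over low-level capsules
    plus one transformation matrix shared by all of them (assumed)."""
    def __init__(self, in_dim: int, out_caps: int, out_dim: int):
        super().__init__()
        # Shared transformation matrix: one (in_dim x out_dim) map per
        # high-level capsule, reused by every low-level capsule; a
        # conventional CapsNet stores one matrix per capsule pair.
        self.W = nn.Parameter(0.01 * torch.randn(out_caps, in_dim, out_dim))

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, in_caps, in_dim) low-level capsule vectors.
        # 1) Attention: pairwise correlation between low-level capsules,
        #    softmax-normalised and used to reweight them, so highly
        #    correlated features dominate the forward propagation.
        scores = u @ u.transpose(1, 2) / u.size(-1) ** 0.5
        u = torch.softmax(scores, dim=-1) @ u

        # 2) Voting through the shared matrix:
        #    u_hat[b, i, o] = u[b, i] @ W[o]
        u_hat = torch.einsum('bid,odk->biok', u, self.W)  # (b, in, out, out_dim)

        # 3) Activate high-level capsules from vote agreement (here the
        #    squashed mean vote, a simple stand-in for the paper's rule).
        s = u_hat.mean(dim=1)                             # (b, out_caps, out_dim)
        n = s.norm(dim=-1, keepdim=True)
        return (n.pow(2) / (1 + n.pow(2))) * s / (n + 1e-8)
```

Under this sketch's assumptions, the shared W holds out_caps × in_dim × out_dim parameters, versus in_caps × out_caps × in_dim × out_dim for per-pair matrices; a saving of this kind is presumably what underlies the abstract's 4.9 % parameter reduction, though the exact figure depends on the full architecture.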
Pages: 1577-1585
Page count: 8