A lightweight capsule network via channel-space decoupling and self-attention routing

Cited: 2
Authors
Guo, Yifan [1 ]
Zhang, Sulan [1 ]
Zhang, Chunmei [2 ]
Gao, Hongli [1 ]
Li, Huajie [1 ]
Affiliations
[1] Taiyuan Univ Sci & Technol, Sch Comp Sci & Technol, Taiyuan 030024, Shanxi, Peoples R China
[2] Taiyuan Univ Sci & Technol, Sch Elect Informat, Taiyuan 030024, Shanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Capsule network; Lightweight network; DSA-CapsNet; Channel-space decoupling; Self-attention routing;
DOI
10.1007/s11042-024-18861-1
CLC number
TP [Automation & Computer Technology]
Discipline code
0812
Abstract
Compared to traditional convolutional neural networks (CNNs), the Capsule Network (CapsNet), whose capsule-based design more closely follows the way human neurons operate, has stronger representation ability because it captures latent spatial structural relationships among the parts of an entity. However, transforming neurons into capsules and the iterative routing mechanism impose a considerable computational burden, which is the main drawback of CapsNet. In addition, the fully connected decoder network exhibits large reconstruction errors on more complex datasets (e.g., CIFAR-10), which hurts the model's classification performance. To this end, this paper proposes a lightweight capsule network (DSA-CapsNet) based on channel-space decoupling and self-attention routing. First, a set of residual blocks builds the residual extraction layer, where deep features are decoupled so that channel-wise and spatial correlations are modeled separately, reducing the number of parameters while generating the initial capsules. Second, a self-attention routing algorithm is introduced between capsule layers to handle fewer capsules efficiently, allowing more capsule layers to be stacked. Finally, a deconvolution decoder module replaces the fully connected decoder of CapsNet as a better reconstruction method. In evaluations on four benchmark datasets, DSA-CapsNet drastically reduces parameters and runtime while achieving better classification results.
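As a rough illustration of single-pass attention routing (as opposed to CapsNet's iterative dynamic routing), the sketch below aggregates each lower capsule's prediction for an upper capsule using softmax-normalized scaled dot-product scores. The choice of the mean prediction as the attention query is an assumption made here for illustration, not the paper's exact formulation:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_route(u_hat):
    """u_hat[i][j] is the prediction vector from lower capsule i to
    upper capsule j. Each upper capsule's output is computed in one
    forward pass: no routing iterations are needed."""
    n_lower, n_upper = len(u_hat), len(u_hat[0])
    d = len(u_hat[0][0])
    outputs = []
    for j in range(n_upper):
        preds = [u_hat[i][j] for i in range(n_lower)]
        # Illustrative query: the mean prediction for upper capsule j.
        query = [sum(p[k] for p in preds) / n_lower for k in range(d)]
        # Scaled dot-product attention weights over the lower capsules.
        weights = softmax([dot(p, query) / math.sqrt(d) for p in preds])
        outputs.append([sum(w * p[k] for w, p in zip(weights, preds))
                        for k in range(d)])
    return outputs
```

Because the weights come from a single softmax pass rather than several routing iterations, the cost per capsule layer drops, which is what makes stacking more capsule layers affordable.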
Particularly, on the CIFAR-10 dataset, DSA-CapsNet achieves a 75.38% reduction in parameters compared to the original CapsNet, with a 25.71% increase in classification accuracy.
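To see where the parameter savings of channel-space decoupling come from, compare a standard convolution with a depthwise-separable-style factorization (the 3x3 kernel and 256 channels below are illustrative values, not figures from the paper):

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: every output channel mixes all input
    # channels and spatial positions jointly.
    return k * k * c_in * c_out

def decoupled_params(k, c_in, c_out):
    # Channel-space decoupling: a k x k depthwise convolution models
    # spatial correlations per channel, then a 1 x 1 pointwise
    # convolution models cross-channel correlations.
    return k * k * c_in + c_in * c_out

standard = conv_params(3, 256, 256)      # 589,824 parameters
decoupled = decoupled_params(3, 256, 256)  # 67,840 parameters
reduction = 1 - decoupled / standard       # roughly 88.5% fewer
```

The factorization shrinks the multiplicative k*k*c_in*c_out term into two additive terms, which is the same decomposition popularized by depthwise separable convolutions (cf. Xception in the paper's references).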
Pages: 83513-83533 (21 pages)