Mask Dynamic Routing to Combined Model of Deep Capsule Network and U-Net

Cited by: 20
Authors
Chen, Junying [1 ]
Liu, Zhan [1 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Routing; Prediction algorithms; Image reconstruction; Clustering algorithms; Data models; Computational modeling; Heuristic algorithms; Capsule network; DCN-UN model; equivariance; mask dynamic routing (DR); reconstruction;
DOI
10.1109/TNNLS.2020.2984686
CLC (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The capsule network is a novel architecture for encoding the feature attributes and spatial relationships of an image. A capsule network (CapsNet) model can be trained with the dynamic routing (DR) algorithm. However, the original three-layer CapsNet with the DR algorithm performs poorly on complex data sets such as FashionMNIST, CIFAR-10, and CIFAR-100, which limits the wider application of capsule networks. In this article, we propose a deep capsule network model combined with a U-Net preprocessing module (DCN-UN). Local connection and weight-sharing strategies are adopted from convolutional neural networks to design a convolutional capsule layer in the DCN-UN model, which considerably reduces the number of parameters. Moreover, a greedy strategy is incorporated into the design of a mask DR (MDR) algorithm to improve model performance. DCN-UN requires up to five times fewer parameters than the original CapsNet and other CapsNet-based models. The performance improvement of the DCN-UN model with the MDR algorithm over the original CapsNet model with the DR algorithm is approximately 12% on CIFAR-10 and 17% on CIFAR-100. The experimental results confirm that the proposed DCN-UN model preserves the image-reconstruction and equivariance advantages of capsule networks. In addition, an efficient initialization method is explored to enhance training stability and avoid gradient explosion.
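For orientation, the sketch below illustrates the standard routing-by-agreement (DR) loop that the abstract builds on; it is a minimal NumPy rendering of the generic algorithm, not the paper's MDR variant (the greedy masking strategy and the DCN-UN architecture are specific to the paper and not reproduced here). All function and variable names are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing nonlinearity: keeps a vector's orientation, maps its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Generic routing-by-agreement between a lower and an upper capsule layer.

    u_hat: array of shape (num_lower, num_upper, dim_upper) holding the
           prediction vectors from each lower capsule to each upper capsule.
    Returns v: (num_upper, dim_upper) output capsule vectors.
    """
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))                       # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients (softmax over upper capsules)
        s = np.einsum('ij,ijd->jd', c, u_hat)                   # weighted sum of predictions per upper capsule
        v = squash(s)                                            # upper-capsule activations
        b = b + np.einsum('ijd,jd->ij', u_hat, v)                # increase logits where prediction agrees with output
    return v

# Toy usage: 8 lower capsules routing to 4 upper capsules of dimension 16.
u_hat = 0.1 * np.random.randn(8, 4, 16)
v = dynamic_routing(u_hat, num_iters=3)
print(v.shape)  # (4, 16)
```

The paper's MDR algorithm modifies this loop with a greedy masking of routing assignments; the abstract does not give those details, so they are omitted above.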
Pages: 2653-2664
Page count: 12