Mask Dynamic Routing to Combined Model of Deep Capsule Network and U-Net

Cited by: 20
Authors
Chen, Junying [1 ]
Liu, Zhan [1 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Routing; Prediction algorithms; Image reconstruction; Clustering algorithms; Data models; Computational modeling; Heuristic algorithms; Capsule network; DCN-UN model; equivariance; mask dynamic routing (DR); reconstruction;
DOI
10.1109/TNNLS.2020.2984686
CLC (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The capsule network is a novel architecture for encoding the feature attributes and spatial relationships of an image. A capsule network (CapsNet) model can be trained with the dynamic routing (DR) algorithm. However, the original three-layer CapsNet with the DR algorithm performs poorly on complex data sets such as FashionMNIST, CIFAR-10, and CIFAR-100, which limits the wider application of capsule networks. In this article, we propose a deep capsule network model combined with a U-Net preprocessing module (DCN-UN). Local connection and weight-sharing strategies are adopted from convolutional neural networks to design a convolutional capsule layer in the DCN-UN model, which considerably reduces the number of parameters. Moreover, a greedy strategy is incorporated into the design of a mask DR (MDR) algorithm to improve model performance. DCN-UN requires up to five times fewer parameters than the original CapsNet and other CapsNet-based models. The performance improvement of the DCN-UN model with the MDR algorithm over the original CapsNet model with the DR algorithm is approximately 12% on CIFAR-10 and 17% on CIFAR-100. The experimental results confirm that the proposed DCN-UN model preserves the image-reconstruction and equivariance advantages of capsule networks. In addition, an efficient initialization method is explored to enhance training stability and avoid gradient explosion.
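For orientation, the sketch below illustrates the standard routing-by-agreement (DR) loop that the abstract builds on; it is a minimal NumPy rendering of the generic algorithm, not the paper's MDR variant (the greedy masking strategy and the DCN-UN architecture are specific to the paper and not reproduced here). All function and variable names are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing nonlinearity: keeps a vector's orientation, maps its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Generic routing-by-agreement between a lower and an upper capsule layer.

    u_hat: array of shape (num_lower, num_upper, dim_upper) holding the
           prediction vectors from each lower capsule to each upper capsule.
    Returns v: (num_upper, dim_upper) output capsule vectors.
    """
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))                       # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients (softmax over upper capsules)
        s = np.einsum('ij,ijd->jd', c, u_hat)                   # weighted sum of predictions per upper capsule
        v = squash(s)                                            # upper-capsule activations
        b = b + np.einsum('ijd,jd->ij', u_hat, v)                # increase logits where prediction agrees with output
    return v

# Toy usage: 8 lower capsules routing to 4 upper capsules of dimension 16.
u_hat = 0.1 * np.random.randn(8, 4, 16)
v = dynamic_routing(u_hat, num_iters=3)
print(v.shape)  # (4, 16)
```

The paper's MDR algorithm modifies this loop with a greedy masking of routing assignments; the abstract does not give those details, so they are omitted above.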
Pages: 2653-2664
Page count: 12