MS-NET: modular selective network: Round robin based modular neural network architecture with limited redundancy

Cited by: 3

Authors:

Chowdhury, Intisar Md [1]

Su, Kai [1]

Zhao, Qiangfu [1]

Affiliation:

[1] Univ Aizu, Syst Intelligence Lab, Aizu Wakamatsu, Fukushima 9658580, Japan

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2021 / Vol. 12 / Issue 03

Keywords:

Modular neural networks; Deep learning; Knowledge-distillation; Multi-class classification; Image classification; CLASSIFICATION; RECOGNITION; MIXTURES;

DOI:

10.1007/s13042-020-01201-8

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a modular Deep Neural Network (DNN) architecture for multi-class classification tasks. The architecture consists of two parts: a router network and a set of expert networks. For a C-class classification problem, it has exactly C experts. The experts and the router are built on the same simple DNN backbone. For each class, the modular network has a certain number rho of expert networks specializing in that particular class, where rho is called the redundancy rate in this study. We demonstrate that rho plays a vital role in the performance of the network. Although these experts are lightweight and individually weak learners, together they match the performance of more complex DNNs. We train the network in two phases: first, the router is trained on the whole training set; then each expert network is trained with a new stochastic objective function that alternates between a small subset of expert data and the whole data set. This alternating training provides an additional form of regularization and keeps the expert network from over-fitting to the subset data. During the testing phase, the router dynamically selects a fixed number of experts for further evaluation of the input datum. The modular nature and low parameter requirement of the network make it well suited to distributed and low-computation environments. Extensive empirical study and theoretical analysis on CIFAR-10, CIFAR-100 and F-MNIST substantiate the effectiveness and efficiency of the proposed modular network.
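The abstract does not spell out how classes are assigned to experts or how the router picks experts at test time. The following is a minimal Python sketch under two labeled assumptions, not the authors' implementation: (1) expert i covers rho consecutive classes in round-robin order, so each class is covered by exactly rho of the C experts; (2) the router selects the k experts whose class subsets carry the most router probability mass. The names `round_robin_subsets` and `select_experts` are hypothetical.

```python
def round_robin_subsets(num_classes, rho):
    """Assumed assignment: expert i covers classes i, i+1, ..., i+rho-1
    (mod num_classes). With num_classes experts, every class then appears
    in exactly rho expert subsets."""
    return [[(i + j) % num_classes for j in range(rho)]
            for i in range(num_classes)]


def select_experts(router_probs, subsets, k):
    """Assumed selection rule: score each expert by the router probability
    mass over its class subset and return the indices of the k best."""
    mass = [sum(router_probs[c] for c in classes) for classes in subsets]
    ranked = sorted(range(len(subsets)), key=lambda e: mass[e], reverse=True)
    return ranked[:k]


# Toy usage: 4 classes, redundancy rate rho = 2, router picks k = 2 experts.
subsets = round_robin_subsets(4, 2)
router_probs = [0.10, 0.60, 0.25, 0.05]  # made-up router output for one input
chosen = select_experts(router_probs, subsets, 2)
```

In this toy setting only the two chosen experts would be evaluated on the input, which is the kind of saving the abstract attributes to selecting a fixed number of experts.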


Pages: 763-781

Page count: 19

References: 80 in total
