Dynamic Capacity Networks

Authors
Almahairi, Amjad [1]
Ballas, Nicolas [1]
Cooijmans, Tim [1]
Zheng, Yin [2]
Larochelle, Hugo [3]
Courville, Aaron [1]
Affiliations
[1] Univ Montreal, MILA, Quebec City, PQ, Canada
[2] Hulu LLC, Beijing, Peoples R China
[3] Twitter, Cambridge, MA USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48 | 2016 / Vol. 48
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism that efficiently identifies the input regions to which the DCN's output is most sensitive and to which we should devote more capacity. We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance.
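The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that sensitivity is measured as the gradient norm of the output entropy with respect to each coarse feature vector, and it replaces the network's top layers with a plain linear softmax classifier over the flattened coarse features. The abstract specifies neither choice. All names (`output_entropy`, `saliency`, `refine_top_k`, `fine_fn`) are hypothetical.

```python
import math


def output_entropy(coarse, weights, bias):
    """Entropy of a linear softmax classifier over flattened coarse features.

    coarse:  N positions x D features (list of lists), from the low-capacity path
    weights: N x D x K classifier weights (a stand-in for the top layers)
    bias:    K biases
    Returns (entropy, log-probabilities).
    """
    n_classes = len(bias)
    logits = list(bias)
    for c_n, w_n in zip(coarse, weights):
        for c_nd, w_nd in zip(c_n, w_n):
            for k in range(n_classes):
                logits[k] += c_nd * w_nd[k]
    # Numerically stable log-softmax.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    log_p = [z - log_norm for z in logits]
    entropy = -sum(math.exp(lp) * lp for lp in log_p)
    return entropy, log_p


def saliency(coarse, weights, bias):
    """Gradient norm of the output entropy at each input position (assumed measure)."""
    entropy, log_p = output_entropy(coarse, weights, bias)
    # For H = -sum_k p_k log p_k with p = softmax(z): dH/dz_k = -p_k (log p_k + H).
    grad_logits = [-math.exp(lp) * (lp + entropy) for lp in log_p]
    scores = []
    for w_n in weights:
        # Chain rule through the linear layer: dH/dc_n[d] = sum_k w_n[d][k] * dH/dz_k.
        grad = [sum(w * g for w, g in zip(w_nd, grad_logits)) for w_nd in w_n]
        scores.append(math.sqrt(sum(g * g for g in grad)))
    return scores


def refine_top_k(coarse, scores, k, fine_fn):
    """Replace the k most salient coarse vectors with high-capacity features.

    fine_fn(n) is a hypothetical callable returning the fine feature vector
    computed on the input patch at position n.
    """
    top = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)[:k]
    refined = [list(c) for c in coarse]
    for n in top:
        refined[n] = fine_fn(n)
    return refined, top
```

In this sketch the high-capacity path runs on only `k` of the `N` positions, which is where the computational saving would come from. The linear top layer uses a separate weight block per position so that the gradient differs across positions.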
Pages: 10