Sparseout: Controlling Sparsity in Deep Networks

Cited by: 4
Authors
Khan, Najeeb [1 ]
Stavness, Ian [1 ]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019 / Vol. 11489
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
NEURAL-NETWORKS; CONNECTIVITY;
DOI
10.1007/978-3-030-18305-9_24
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dropout is commonly used to help reduce overfitting in deep neural networks. Sparsity is a potentially important property of neural networks, but it is not explicitly controlled by Dropout-based regularization. In this work, we propose Sparseout, a simple and efficient variant of Dropout that can be used to control the sparsity of the activations in a neural network. We theoretically prove that Sparseout is equivalent to an L_q penalty on the features of a generalized linear model, and that Dropout is a special case of Sparseout for neural networks. We empirically demonstrate that Sparseout is computationally inexpensive and can control the desired level of sparsity in the activations. We evaluated Sparseout on image classification and language modelling tasks to assess the effect of sparsity on each task. We found that sparser activations are favorable for language modelling performance, while image classification benefits from denser activations. Sparseout provides a way to investigate sparsity in state-of-the-art deep learning models. Source code for Sparseout can be found at https://github.com/najeebkhan/sparseout.
Pages: 296-307
Page count: 12
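
As a point of reference for the abstract's central claim, below is a minimal, illustrative sketch (not the authors' implementation) of the regularizer that Sparseout is proven equivalent to: an explicit L_q penalty on a layer's activations added to the task loss of an ordinary dropout network. The network architecture, the choice q = 1.0, and the penalty weight 1e-4 are illustrative assumptions, and PyTorch is assumed here only as a convenient framework; the linked repository should be consulted for the actual Sparseout layer.

import torch
import torch.nn as nn
import torch.nn.functional as F

def lq_activation_penalty(activations, q=1.0):
    # Mean of |a|^q over all hidden units; smaller q pushes activations toward
    # sparsity, while larger q tolerates denser, more distributed activations.
    return activations.abs().pow(q).mean()

class DropoutMLP(nn.Module):
    # A plain dropout network. Sparseout itself modifies the dropout noise,
    # but here the sparsity control is expressed via the equivalent explicit
    # L_q penalty applied to the hidden activations.
    def __init__(self, d_in=784, d_hidden=256, d_out=10, p_drop=0.5):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.drop = nn.Dropout(p_drop)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        h = F.relu(self.fc1(x))
        h = self.drop(h)               # standard Dropout on hidden activations
        return self.fc2(h), h          # also return activations for the penalty

# Usage: add the L_q activation penalty to the task loss.
model = DropoutMLP()
x = torch.randn(32, 784)               # dummy batch of flattened 28x28 inputs
targets = torch.randint(0, 10, (32,))
logits, hidden = model(x)
loss = F.cross_entropy(logits, targets) + 1e-4 * lq_activation_penalty(hidden, q=1.0)
loss.backward()

Roughly, moving q below 2 corresponds to the sparser regime the paper found favorable for language modelling, while values above 2 move toward the denser regime that favored image classification.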