Sparseout: Controlling Sparsity in Deep Networks

Cited by: 4
Authors
Khan, Najeeb [1 ]
Stavness, Ian [1 ]
Affiliation
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019, Vol. 11489
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
NEURAL-NETWORKS; CONNECTIVITY;
DOI
10.1007/978-3-030-18305-9_24
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dropout is commonly used to help reduce overfitting in deep neural networks. Sparsity is a potentially important property of neural networks, but it is not explicitly controlled by Dropout-based regularization. In this work, we propose Sparseout, a simple and efficient variant of Dropout that can be used to control the sparsity of the activations in a neural network. We theoretically prove that Sparseout is equivalent to an L_q penalty on the features of a generalized linear model and that Dropout is a special case of Sparseout for neural networks. We empirically demonstrate that Sparseout is computationally inexpensive and is able to control the desired level of sparsity in the activations. We evaluated Sparseout on image classification and language modelling tasks to assess the effect of sparsity on these tasks. We found that sparsity of the activations is favorable for language modelling performance, while image classification benefits from denser activations. Sparseout provides a way to investigate sparsity in state-of-the-art deep learning models. Source code for Sparseout can be found at https://github.com/najeebkhan/sparseout.
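The abstract describes Sparseout as a Dropout-style perturbation of the activations whose strength is governed by an exponent q, with Dropout recovered as a special case. Below is a minimal, hypothetical PyTorch-style sketch of such a layer, assuming an inverted-Dropout Bernoulli mask and a noise term that scales as |a|^(q/2); the function name, signature, and exact perturbation are illustrative assumptions rather than the authors' implementation (see the linked repository for the reference code).

    import torch

    def sparseout_like(a: torch.Tensor, p: float = 0.5, q: float = 2.0,
                       training: bool = True) -> torch.Tensor:
        # Hypothetical sketch, not the authors' code: perturb activations `a`
        # with zero-mean Bernoulli noise whose magnitude scales as |a|^(q/2).
        # p is the keep probability; q = 2 reduces the layer to inverted Dropout.
        if not training or p >= 1.0:
            return a
        mask = torch.bernoulli(torch.full_like(a, p)) / p  # E[mask] = 1
        scaled = a.abs().pow(q / 2.0) * a.sign()           # equals a when q = 2
        return a + (mask - 1.0) * scaled

With q = 2 the output collapses to mask * a, i.e. standard inverted Dropout, consistent with the abstract's claim that Dropout is a special case of Sparseout; other values of q change the strength of the implied sparsity penalty on the activations.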
Pages: 296-307
Page count: 12