Sparseout: Controlling Sparsity in Deep Networks

Cited by: 4
Authors
Khan, Najeeb [1 ]
Stavness, Ian [1 ]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019, Vol. 11489
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
NEURAL-NETWORKS; CONNECTIVITY;
DOI
10.1007/978-3-030-18305-9_24
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Dropout is commonly used to help reduce overfitting in deep neural networks. Sparsity is a potentially important property of neural networks, but it is not explicitly controlled by Dropout-based regularization. In this work, we propose Sparseout, a simple and efficient variant of Dropout that can be used to control the sparsity of the activations in a neural network. We theoretically prove that Sparseout is equivalent to an L_q penalty on the features of a generalized linear model and that Dropout is a special case of Sparseout for neural networks. We empirically demonstrate that Sparseout is computationally inexpensive and is able to control the desired level of sparsity in the activations. We evaluated Sparseout on image classification and language modelling tasks to assess the effect of sparsity on each. We found that sparsity of the activations is favorable for language modelling performance, while image classification benefits from denser activations. Sparseout provides a way to investigate sparsity in state-of-the-art deep learning models. Source code for Sparseout can be found at https://github.com/najeebkhan/sparseout.
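As a point of reference for the L_q equivalence stated in the abstract, the sketch below imposes an explicit L_q penalty on a layer's activations in PyTorch. This is illustrative only and is not the authors' Sparseout layer (Sparseout itself is a stochastic, Dropout-like operation; see the linked repository for the actual implementation); the class name, penalty weight, and usage lines are assumptions made for this example.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LqActivationPenalty(nn.Module):
        """Explicit L_q penalty on activations, added to the task loss.

        Illustrative sketch of the penalty that Sparseout is shown to be
        equivalent to; it is not the Sparseout layer itself.
        """
        def __init__(self, q=1.0, weight=1e-4):
            super().__init__()
            self.q = q          # q < 2 encourages sparser activations, q > 2 denser ones
            self.weight = weight

        def forward(self, activations):
            # Mean |a|^q over the batch; smaller q pushes activations toward zero.
            return self.weight * activations.abs().pow(self.q).mean()

    # Usage sketch (hypothetical model and data):
    # penalty = LqActivationPenalty(q=1.0, weight=1e-4)
    # loss = F.cross_entropy(logits, targets) + penalty(hidden_activations)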
Pages: 296-307 (12 pages)