A New Framework to Train Autoencoders Through Non-Smooth Regularization

Cited by: 11
Authors
Amini, Sajjad [1, 2]
Ghaemmaghami, Shahrokh [1, 2]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Tehran 1136511155, Iran
[2] Sharif Univ Technol, Elect Res Inst, Tehran 1136511155, Iran
Keywords
Autoencoder; regularizer; gradient descent; proximal operator; LEARNING ALGORITHM; NEURAL-NETWORKS; SPARSE; REPRESENTATION; OPTIMIZATION; FACTORIZATION
DOI
10.1109/TSP.2019.2899294
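Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract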
Deep structures consisting of many layers of nonlinearities have a high potential for expressing complex relations if properly initialized. Autoencoders play a complementary role in training a deep structure by initializing each layer in a greedy unsupervised manner. Because of their high capacity, autoencoders need to be regularized. While mathematical regularizers (based on weight decay, sparsity, etc.) and structural ones (by way of, e.g., denoising and dropout) have been well studied in the literature, few papers have addressed the problem of training autoencoders with non-smooth regularization. In this paper, we address that problem. We propose an efficient algorithm and mathematically prove that it converges, provided the regularizer is proximable. As one of the major applications of the proposed method, we focus on sparse autoencoders and show that the new training method leads to better disentangling of factors of variation.
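To make the term "proximable" in the abstract concrete: a regularizer is proximable when its proximal operator can be evaluated cheaply. The standard example is the l1 norm, whose proximal operator is elementwise soft thresholding. The sketch below is a generic forward-backward (proximal gradient) step on a plain list of weights, not the algorithm proposed in the paper; the function names, the step size `step`, and the weight `lam` are illustrative assumptions.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise to a list of floats."""
    out = []
    for x in v:
        if x > t:
            out.append(x - t)   # shrink positive entries toward zero
        elif x < -t:
            out.append(x + t)   # shrink negative entries toward zero
        else:
            out.append(0.0)     # entries within [-t, t] become exactly zero
    return out


def proximal_gradient_step(w, grad, step, lam):
    """One generic forward-backward step (hypothetical helper, not the paper's method).

    Takes a gradient step on the smooth loss, then applies the proximal
    operator of step * lam * ||.||_1 to handle the non-smooth term.
    """
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    return soft_threshold(moved, step * lam)
```

For example, `soft_threshold([3.0, -0.5, -2.0], 1.0)` shrinks the large entries by 1 and sets the small one to zero, which is how the non-smooth penalty produces exact sparsity instead of merely small values.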
Pages: 1860-1874
Page count: 15