Deep Neural Networks With Trainable Activations and Controlled Lipschitz Constant

Cited by: 22
Authors
Aziznejad, Shayan [1]
Gupta, Harshit [1]
Campos, Joaquim [1]
Unser, Michael [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, Biomed Imaging Grp, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation; European Research Council
Keywords
Deep learning; deep splines; learned activations; Lipschitz regularity; representer theorem; INVERSE PROBLEMS; APPROXIMATION; SPLINES
DOI
10.1109/TSP.2020.3014611
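Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract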
We introduce a variational framework to learn the activation functions of deep neural networks. Our aim is to increase the capacity of the network while controlling an upper bound on the actual Lipschitz constant of the input-output relation. To that end, we first establish a global bound for the Lipschitz constant of neural networks. Based on this bound, we then formulate a variational problem for learning activation functions. Our variational problem is infinite-dimensional and is not computationally tractable. However, we prove that there always exists a solution that has continuous and piecewise-linear (linear-spline) activations. This reduces the original problem to a finite-dimensional minimization in which an ℓ1 penalty on the parameters of the activations favors the learning of sparse nonlinearities. We numerically compare our scheme with standard ReLU networks and their variations, PReLU and LeakyReLU, and we empirically demonstrate the practical aspects of our framework.
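As a rough illustration of the ideas named in the abstract, the following sketch parameterizes a continuous piecewise-linear activation as an affine term plus a sum of shifted ReLUs, computes an ℓ1 penalty on the ReLU coefficients, and evaluates the generic composition bound on a network's Lipschitz constant (product of layer spectral norms and activation Lipschitz constants). The parameterization, the function names, and the choice of spectral norm are assumptions made for illustration only; the abstract does not specify the exact form of the bound or of the penalty used in the paper.

```python
import numpy as np

def linear_spline(x, b0, b1, a, knots):
    """Continuous piecewise-linear activation (assumed form):
    b0 + b1*x + sum_k a[k] * relu(x - knots[k])."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    knots = np.asarray(knots, dtype=float)
    relu_terms = np.maximum(x[..., None] - knots, 0.0)  # shape (..., K)
    return b0 + b1 * x + relu_terms @ a

def l1_penalty(a):
    """l1 norm of the ReLU coefficients; a zero coefficient removes a knot,
    so a small penalty corresponds to an activation with few linear pieces."""
    return float(np.sum(np.abs(a)))

def spline_lipschitz(b1, a):
    """Largest absolute slope over all linear pieces.
    Assumes the knots are sorted in ascending order, so that the slopes are
    b1, b1 + a[0], b1 + a[0] + a[1], ..."""
    slopes = b1 + np.concatenate(([0.0], np.cumsum(np.asarray(a, dtype=float))))
    return float(np.max(np.abs(slopes)))

def network_lipschitz_bound(weights, act_lipschitz):
    """Generic composition bound (Euclidean norm): product of the spectral
    norms of the weight matrices and of the activation Lipschitz constants.
    This is a standard bound, not necessarily the one derived in the paper."""
    bound = 1.0
    for W in weights:
        bound *= np.linalg.norm(W, 2)  # largest singular value
    for L in act_lipschitz:
        bound *= L
    return float(bound)
```

With `b0 = 0`, `b1 = 0`, `a = [1]`, and `knots = [0]`, this parameterization reduces to the standard ReLU, whose Lipschitz constant is 1 and whose ℓ1 penalty is 1.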
Pages: 4688-4699
Number of pages: 12