Optimal deep neural networks by maximization of the approximation power

Cited by: 6
Authors
Calvo-Pardo, Hector [1 ,2 ,3 ,4 ]
Mancini, Tullio [1 ]
Olmo, Jose [1 ,5 ,6 ]
Affiliations
[1] Univ Southampton, Southampton, England
[2] CFS, Osnabruck, Germany
[3] CPC, London, England
[4] ILB, Paris, France
[5] Univ Zaragoza, Zaragoza, Spain
[6] Univ Southampton, Dept Econ, Highfield Campus, Southampton SO17 1BJ, England
Keywords
Machine learning; Artificial intelligence; Data science; Forecasting; Feedforward neural networks; Performance
DOI
10.1016/j.cor.2023.106264
Chinese Library Classification (CLC)
TP39 [Applications of Computers];
Discipline classification code
081203 ; 0835 ;
Abstract
We propose an optimal architecture for deep neural networks of a given size. The optimal architecture is obtained by maximizing the lower bound on the maximum number of linear regions that can be approximated by a deep neural network with ReLU activation functions. The accuracy of the approximating function depends on the network structure, characterized by the number of nodes and the dependence and hierarchy between them within and across layers. We show how the accuracy of the approximation improves as the width and depth of the network are chosen optimally. A Monte Carlo simulation exercise shows that the optimized architecture outperforms cross-validation methods and grid search for linear and nonlinear prediction models. An application of the methodology to the Boston Housing dataset confirms empirically that our method outperforms state-of-the-art machine learning models.
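The abstract does not reproduce the bound being maximized, so the sketch below is illustrative only. It assumes the widely used lower bound of Montúfar et al. (2014) on the number of linear regions of a deep ReLU network, and it searches equal-width architectures under a fixed hidden-neuron budget. The function names (`log_region_bound`, `best_architecture`) and the equal-width restriction are our own simplifications for exposition, not the authors' procedure.

```python
from math import comb, log

def log_region_bound(n0: int, widths: list[int]) -> float:
    """Log of the Montufar et al. (2014) lower bound on the number of
    linear regions of a ReLU network with n0 inputs and hidden-layer
    widths `widths`:
        (prod_{l<L} floor(n_l / n0)^{n0}) * sum_{j=0}^{n0} C(n_L, j).
    """
    *early, last = widths
    total = 0.0
    for w in early:
        if w < n0:                 # bound collapses if a layer is narrower than the input
            return float("-inf")
        total += n0 * log(w // n0)
    total += log(sum(comb(last, j) for j in range(n0 + 1)))
    return total

def best_architecture(n0: int, budget: int, max_depth: int = 10):
    """Among equal-width ReLU networks whose hidden layers use at most
    `budget` neurons in total, return the (log bound, widths) pair that
    maximizes the lower bound on the number of linear regions."""
    best = (float("-inf"), [])
    for depth in range(1, max_depth + 1):
        width = budget // depth    # spend the budget evenly across layers
        if width == 0:
            break
        widths = [width] * depth
        best = max(best, (log_region_bound(n0, widths), widths))
    return best

# Example: with 2 inputs and a budget of 24 hidden neurons, a moderately
# deep, narrow network dominates a single wide layer under this bound.
print(best_architecture(n0=2, budget=24))
```

This captures the qualitative message of the abstract: for a fixed network size, the depth/width split that maximizes the region-count lower bound is generally neither the shallowest nor the deepest feasible configuration.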
Pages: 15