Holistic deep learning

Cited by: 3
Authors
Bertsimas, Dimitris [1 ,2 ]
Carballo, Kimberly Villalobos [2 ]
Boussioux, Leonard [2 ]
Li, Michael Lingzhi [3 ]
Paskov, Alex [2 ]
Paskov, Ivan [2 ]
Affiliations
[1] MIT, Sloan School of Management, Cambridge, MA 02139, USA
[2] MIT, Operations Research Center, Cambridge, MA 02139, USA
[3] Harvard Business School, Technology & Operations Management, Boston, MA 02163, USA
Keywords
Deep learning; Optimization; Robustness; Sparsity; Stability; Regularization
DOI
10.1007/s10994-023-06482-y
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel holistic deep learning framework that simultaneously addresses the challenges of vulnerability to input perturbations, overparametrization, and performance instability from different train-validation splits. The proposed framework holistically improves accuracy, robustness, sparsity, and stability over standard deep learning models, as demonstrated by extensive experiments on both tabular and image data sets. The results are further validated by ablation experiments and SHAP value analysis, which reveal the interactions and trade-offs between the different evaluation metrics. To support practitioners applying our framework, we provide a prescriptive approach that offers recommendations for selecting an appropriate training loss function based on their specific objectives. All the code to reproduce the results can be found at https://github.com/kimvc7/HDL.
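As a rough illustration of how a single training objective could combine the four goals named in the abstract, the Python sketch below mixes a standard cross-entropy term with an FGSM-style adversarial term for robustness, an L1 penalty for sparsity, and a penalty on the loss gap between two data splits for stability. This is a generic sketch under assumed weightings, not the authors' formulation from the paper; the function name holistic_loss and the coefficients eps, alpha, beta, and gamma are hypothetical.

# Illustrative sketch only (PyTorch): a generic "holistic" training loss.
# The adversarial step, L1 penalty, and split-gap penalty stand in for the
# robustness, sparsity, and stability objectives described in the abstract;
# all weights are assumptions, not values from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

def holistic_loss(model, x, y, x_alt, y_alt,
                  eps=0.1, alpha=1.0, beta=1e-4, gamma=0.1):
    # Accuracy: standard cross-entropy on the primary batch.
    clean_loss = F.cross_entropy(model(x), y)

    # Robustness: cross-entropy on an FGSM-style perturbed copy of the input.
    x_adv = x.detach().clone().requires_grad_(True)
    adv_ce = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(adv_ce, x_adv)[0]
    x_pert = (x_adv + eps * grad.sign()).detach()
    robust_loss = F.cross_entropy(model(x_pert), y)

    # Sparsity: L1 penalty pushing weights toward zero.
    l1_penalty = sum(p.abs().sum() for p in model.parameters())

    # Stability: penalize the loss gap between two different data splits.
    alt_loss = F.cross_entropy(model(x_alt), y_alt)
    stability_penalty = (clean_loss - alt_loss).abs()

    return clean_loss + alpha * robust_loss + beta * l1_penalty + gamma * stability_penalty

# Example usage with a tiny MLP on random data (shapes are illustrative).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
x, y = torch.randn(32, 20), torch.randint(0, 3, (32,))
x_alt, y_alt = torch.randn(32, 20), torch.randint(0, 3, (32,))
loss = holistic_loss(model, x, y, x_alt, y_alt)
loss.backward()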
Pages: 159-183
Page count: 25