Sensitivity Analysis for Deep Learning: Ranking Hyper-parameter Influence

Cited by: 20
Authors
Taylor, Rhian [1]
Ojha, Varun [1]
Martino, Ivan [2]
Nicosia, Giuseppe [3]
Affiliations
[1] Univ Reading, Dept Comp Sci, Reading, Berks, England
[2] KTH Royal Inst Technol, Stockholm, Sweden
[3] Univ Cambridge, Cambridge, England
Source
2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI 2021) | 2021
Keywords
Sensitivity Analysis; Deep Learning; Hyper-parameter Tuning; Hyper-parameter Rank; Hyper-parameter Influence; Neural Networks; Uncertainty
DOI
10.1109/ICTAI52525.2021.00083
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We present a novel approach to ranking Deep Learning (DL) hyper-parameters through the application of Sensitivity Analysis (SA). DL hyper-parameter tuning is crucial to model accuracy; however, choosing optimal values for each parameter is time- and resource-intensive. SA provides a quantitative measure by which hyper-parameters can be ranked according to their contribution to model accuracy. Learning rate decay ranked highest: model performance was sensitive to this parameter regardless of architecture or dataset. The influence of a model's initial learning rate was found to be low, contrary to the literature. Additionally, the importance of a parameter is closely linked to model architecture: shallower models were susceptible to hyper-parameters affecting the stochasticity of the learning process, whereas deeper models were sensitive to hyper-parameters affecting convergence speed. Furthermore, the complexity of the dataset can affect the margin of separation between the sensitivity measures of the most and least influential parameters, making the most influential hyper-parameter an ideal tuning candidate compared with the others.
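
As a concrete illustration of the approach the abstract describes, the sketch below ranks a few DL hyper-parameters by their effect on model accuracy using Morris elementary-effects screening from the SALib library. The choice of SA method, the hyper-parameter set (initial learning rate, momentum, batch size), the bounds, and the small scikit-learn MLP used as the model are all illustrative assumptions, not the authors' experimental setup.

import numpy as np
from SALib.sample.morris import sample as morris_sample
from SALib.analyze.morris import analyze as morris_analyze
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hypothetical hyper-parameter space; names and bounds are assumptions.
problem = {
    "num_vars": 3,
    "names": ["learning_rate_init", "momentum", "batch_size"],
    "bounds": [[1e-4, 1e-1], [0.0, 0.99], [16, 256]],
}

# Morris trajectories: N * (num_vars + 1) model trainings in total.
samples = morris_sample(problem, N=10, num_levels=4)

def accuracy(h):
    # Train a small surrogate network and return test accuracy,
    # the model output whose sensitivity we measure.
    lr, mom, bs = h
    clf = MLPClassifier(hidden_layer_sizes=(32,), solver="sgd",
                        learning_rate_init=lr, momentum=mom,
                        batch_size=int(bs), max_iter=50, random_state=0)
    clf.fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

Y = np.array([accuracy(h) for h in samples])

# mu_star is the mean absolute elementary effect: larger values mean
# the hyper-parameter is more influential on accuracy.
result = morris_analyze(problem, samples, Y, num_levels=4)
for name, mu in sorted(zip(problem["names"], result["mu_star"]),
                       key=lambda t: -t[1]):
    print(f"{name}: mu* = {mu:.4f}")

Sorting by mu* yields the kind of influence ranking the paper argues for; the sigma values from the same analysis would additionally flag non-linear or interaction effects.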
Pages: 512-516
Page count: 5