Continuously evolving dropout with multi-objective evolutionary optimisation

Times cited: 5
Authors
Jiang, Pengcheng [1 ]
Xue, Yu [1 ]
Neri, Ferrante [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Univ Surrey, Dept Comp Sci, NICE Grp, Guildford GU2 7XH, England
Keywords
Genetic algorithms; Multi-objective optimisation; Deep neural networks; Over-fitting; Dropout; Neural networks; Algorithm
DOI
10.1016/j.engappai.2023.106504
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
Dropout is an effective method of mitigating over-fitting while training deep neural networks (DNNs). The method switches off (drops) some of the neurons of the DNN and trains it with the remaining neurons active, which makes the DNN more general and resilient to changes in its inputs. However, the probability that a neuron in a layer is dropped, the 'dropout rate', is a hard-to-tune parameter that affects the performance of the trained model. Moreover, there is no reason, other than convenience during parameter tuning, why the dropout rate should be the same for all neurons in a layer. This paper proposes a novel method that guides the dropout rates with an evolutionary algorithm. In contrast to previous studies, we associate a dropout rate with each individual neuron of the network, thus allowing more flexibility in the training phase. The vector encoding the dropout rates for the entire network is interpreted as a candidate solution of a bi-objective optimisation problem, where the first objective is the error reduction achieved by a set of dropout rates on a given data batch, and the second objective is the distance of the used dropout rates from a pre-arranged constant. The second objective controls the dropout rates and prevents them from becoming too small, hence ineffective, or too large, thereby dropping too large a portion of the network. Experimental results show that the proposed method, namely GADropout, produces DNNs that consistently outperform DNNs designed with other dropout methods, including modern advanced dropout methods representing the state of the art. GADropout has been tested on multiple datasets and network architectures.
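To make the encoding concrete, the sketch below (not from the paper) illustrates how one candidate solution could be evaluated on a toy NumPy network: each hidden neuron carries its own dropout rate, the first objective is the batch error obtained under those rates, and the second is the mean distance of the rates from a target constant. The toy architecture, the cross-entropy error, and the target value of 0.5 are illustrative assumptions; the paper's GADropout couples such an evaluation with a multi-objective evolutionary search loop that is not reproduced here.

```python
# Minimal sketch, assuming a toy single-hidden-layer NumPy network and a
# target dropout constant of 0.5; the paper's actual objectives, network
# architectures, and evolutionary operators may differ in detail.
import numpy as np

rng = np.random.default_rng(0)

# Toy network: 20 inputs -> 32 hidden neurons -> 3 classes.
W1, b1 = rng.normal(size=(20, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 3)) * 0.1, np.zeros(3)

def forward(x, dropout_rates):
    """Forward pass with one Bernoulli mask per hidden neuron."""
    h = np.maximum(x @ W1 + b1, 0.0)                       # ReLU hidden layer
    keep = rng.random(h.shape[1]) >= dropout_rates          # per-neuron keep mask
    h = h * keep / np.clip(1.0 - dropout_rates, 1e-6, 1.0)  # inverted-dropout scaling
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)                 # softmax probabilities

def objectives(dropout_rates, x_batch, y_batch, target=0.5):
    """Two objectives to minimise for one candidate dropout-rate vector."""
    probs = forward(x_batch, dropout_rates)
    batch_error = -np.log(probs[np.arange(len(y_batch)), y_batch] + 1e-12).mean()
    rate_deviation = np.abs(dropout_rates - target).mean()
    return batch_error, rate_deviation

# One candidate solution: a dropout rate for every hidden neuron.
candidate = rng.uniform(0.0, 1.0, size=32)
x_batch = rng.normal(size=(16, 20))
y_batch = rng.integers(0, 3, size=16)
print(objectives(candidate, x_batch, y_batch))
```

In an evolutionary loop, a population of such candidate vectors would be evaluated with these two objectives and evolved with the usual selection, crossover, and mutation operators.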
Pages: 10