Deep learning with ExtendeD Exponential Linear Unit (DELU)

Cited by: 2
Authors
Catalbas, Burak [1 ]
Morgul, Oemer [1 ]
Affiliations
[1] Bilkent Univ, Dept Elect & Elect Engn, TR-06800 Ankara, Turkiye
Keywords
Artificial neural networks; Activation functions; Classification; Image segmentation;
DOI
10.1007/s00521-023-08932-z
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Activation functions are crucial components of artificial neural networks. Since the first artificial perceptron, many such functions have been proposed; some, such as the Rectified Linear Unit (ReLU), the Exponential Linear Unit (ELU), and other ReLU variants, are in common use today. In this article we propose a novel activation function called the ExtendeD Exponential Linear Unit (DELU). After introducing it and presenting its basic properties, we show through simulations on different datasets and architectures that it can outperform other activation functions in certain cases. While inheriting most of the good properties of ReLU and ELU, DELU improves on them by slowing the alignment of neurons in the early stages of training. In our experiments, DELU generally performed better than other activation functions on the Fashion-MNIST, CIFAR-10, and CIFAR-100 classification tasks with Residual Neural Networks (ResNets) of different sizes. In particular, DELU reduced the error rate on the CIFAR datasets at high confidence levels compared with ReLU and ELU networks. DELU is also compared on an image segmentation example, and its compatibility with different initializations is tested. Statistical methods based on Z-score analysis are employed to verify these success rates, which may be considered a different view of success assessment in neural networks.
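For reference, the two baseline activation functions the abstract compares against have standard definitions, sketched below in NumPy. The exact form of DELU is defined in the paper itself and is not reproduced here; `alpha` is ELU's usual scale parameter.

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: max(0, x), elementwise
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # Exponential Linear Unit: x for x > 0, alpha * (exp(x) - 1) otherwise
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
```

Unlike ReLU, ELU is smooth at the origin and saturates to -alpha for large negative inputs, which keeps negative activations bounded while still passing gradients.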
Pages: 22705-22724 (20 pages)