Activate or Not: Learning Customized Activation

Cited by: 89

Authors:

Ma, Ningning ^{[1]}

Zhang, Xiangyu ^{[2]}

Liu, Ming ^{[1]}

Sun, Jian ^{[2]}

Affiliations:

[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

[2] MEGVII Technol, Beijing, Peoples R China

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

DOI:

10.1109/CVPR46437.2021.00794

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not. Interestingly, we find Swish, the recent popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. Intuitively, in the same way, we approximate the more general Maxout family to our novel ACON family, which remarkably improves the performance and makes Swish a special case of ACON. Next, we present meta-ACON, which explicitly learns to optimize the parameter switching between non-linear (activate) and linear (inactivate) and provides a new design space. By simply changing the activation function, we show its effectiveness on both small models and highly optimized large models (e.g., it improves the ImageNet top-1 accuracy rate by 6.7% and 1.8% on MobileNet-0.25 and ResNet-152, respectively). Moreover, our novel ACON can be naturally transferred to object detection and semantic segmentation, showing that ACON is an effective alternative in a variety of tasks. Code is available at https://github.com/nmaac/acon.
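The abstract's claim that Swish smoothly approximates ReLU, and that the same smoothing applied to Maxout yields the ACON family, can be illustrated with a two-input smooth maximum. The following is a minimal sketch, not the authors' released code: the formulas are recalled from the published paper rather than from this record, and the function names (`smooth_max`, `swish`, `acon_c`) and the default values of `p1`, `p2`, and `beta` are chosen here for illustration. In the paper these parameters are learned; in this sketch they are fixed scalars.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_max(a: float, b: float, beta: float) -> float:
    """Smooth maximum of two values: (a - b) * sigmoid(beta * (a - b)) + b.

    beta -> infinity recovers max(a, b) (the non-linear, "activate" regime);
    beta = 0 gives the mean (a + b) / 2 (the linear, "inactivate" regime).
    """
    d = a - b
    return d * sigmoid(beta * d) + b


def swish(x: float, beta: float = 1.0) -> float:
    """Swish as the smooth version of ReLU = max(x, 0)."""
    return smooth_max(x, 0.0, beta)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """Smooth version of the Maxout-style max(p1 * x, p2 * x).

    With p1 = 1 and p2 = 0 this reduces to Swish.
    p1, p2, beta are fixed here (assumption); the paper learns them.
    """
    return smooth_max(p1 * x, p2 * x, beta)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, max(x, 0.0), swish(x, beta=50.0), acon_c(x, p1=1.2, p2=0.2))
```

With a large `beta`, `swish` tracks ReLU closely, while `beta = 0` collapses the function to a linear map; the switching parameter that meta-ACON learns plays the role of `beta` in this sketch.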


Pages: 8028-8038

Page count: 11
