Activate or Not: Learning Customized Activation

Cited by: 99

Authors:

Ma, Ningning [1]

Zhang, Xiangyu [2]

Liu, Ming [1]

Sun, Jian [2]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

[2] MEGVII Technol, Beijing, Peoples R China

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

DOI:

10.1109/CVPR46437.2021.00794

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a simple, effective, and general activation function we term ACON, which learns to activate the neurons or not. Interestingly, we find that Swish, the recently popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. Intuitively, in the same way, we approximate the more general Maxout family to our novel ACON family, which remarkably improves the performance and makes Swish a special case of ACON. Next, we present meta-ACON, which explicitly learns to optimize the parameter switching between non-linear (activate) and linear (inactivate) and provides a new design space. By simply changing the activation function, we show its effectiveness on both small models and highly optimized large models (e.g., it improves the ImageNet top-1 accuracy rate by 6.7% and 1.8% on MobileNet0.25 and ResNet-152, respectively). Moreover, our novel ACON can be naturally transferred to object detection and semantic segmentation, showing that ACON is an effective alternative in a variety of tasks. Code is available at https://github.com/nmaac/acon.
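The abstract's two structural claims (Swish as a smooth approximation to ReLU, and Swish as a special case of ACON) can be checked numerically. The following is a minimal sketch, assuming the ACON-C form f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x; this record does not state the formula, so it is recalled from the paper and should be verified against it. The function names are illustrative and are not taken from the authors' repository, and p1, p2, and beta are plain scalars here rather than learned parameters.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """Assumed ACON-C form: a smooth maximum of the two lines p1*x and p2*x.

    beta is the switching factor: a large beta approaches max(p1*x, p2*x)
    (non-linear, "activate"); beta = 0 gives the linear function
    (p1 + p2) / 2 * x ("inactivate").
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(
            f"x={x:+.1f}  "
            f"swish={swish(x):+.4f}  "
            f"acon_c(p1=1,p2=0)={acon_c(x):+.4f}  "
            f"acon_c(beta=50)={acon_c(x, beta=50.0):+.4f}  "
            f"relu={max(x, 0.0):+.4f}"
        )
```

With p1 = 1 and p2 = 0 the sketch coincides with Swish, and raising beta moves its output toward ReLU. Setting beta to 0 collapses it to a straight line, which is the linear end of the switching behaviour that the abstract attributes to meta-ACON.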


Pages: 8028-8038

Page count: 11

