Neural Architecture Transfer

Cited by: 109
Authors
Lu, Zhichao [1 ]
Sreekumar, Gautam [2 ]
Goodman, Erik [2 ]
Banzhaf, Wolfgang [2 ]
Deb, Kalyanmoy [2 ]
Boddeti, Vishnu Naresh [2 ]
Affiliations
[1] Southern Univ Sci & Technol, Shenzhen 518055, Guangdong, Peoples R China
[2] Michigan State Univ, E Lansing, MI 48824 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
Convolutional neural networks; neural architecture search; AutoML; transfer learning; evolutionary algorithms
DOI
10.1109/TPAMI.2021.3052758
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. Existing NAS approaches require one complete search for each deployment specification of hardware or objective, a computationally impractical endeavor given the potentially large number of application scenarios. In this paper, we propose Neural Architecture Transfer (NAT) to overcome this limitation. NAT is designed to efficiently generate task-specific custom models that are competitive under multiple conflicting objectives. To realize this goal, we learn task-specific supernets from which specialized subnets can be sampled without any additional training. The key to our approach is an integrated online transfer learning and many-objective evolutionary search procedure: a pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark image classification tasks ranging from large-scale multi-class to small-scale fine-grained datasets. In all cases, including ImageNet, NATNets improve upon the state-of-the-art under mobile settings (<= 600M Multiply-Adds). Surprisingly, small-scale fine-grained datasets benefit the most from NAT. At the same time, the architecture search and transfer is orders of magnitude more efficient than existing NAS methods. Overall, experimental evaluation indicates that, across diverse image classification tasks and computational objectives, NAT is an appreciably more effective alternative to conventional transfer learning, i.e., fine-tuning the weights of an existing network architecture learned on standard datasets. Code is available at https://github.com/human-analysis/neural-architecture-transfer
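The abstract's core loop, sampling subnets from a weight-sharing supernet while a many-objective evolutionary search trades accuracy off against Multiply-Adds, can be illustrated with a minimal Python sketch. Everything below (the layer encoding, the proxy objectives, the mutation-only selection scheme) is a hypothetical simplification for illustration, not the authors' implementation; the actual code is in the linked repository.

import random

# A subnet is encoded as a list of per-layer choices (e.g., kernel size).
CHOICES = [3, 5, 7]      # hypothetical per-layer operator choices
NUM_LAYERS = 20

def sample_subnet():
    """Sample a random architecture encoding from the supernet's search space."""
    return [random.choice(CHOICES) for _ in range(NUM_LAYERS)]

def mutate(encoding, rate=0.1):
    """Per-gene mutation: resample a layer's choice with probability `rate`."""
    return [random.choice(CHOICES) if random.random() < rate else g
            for g in encoding]

def evaluate(encoding):
    """Stand-in for the two conflicting objectives: (error, multiply-adds).
    In NAT these would come from the adapted supernet without retraining."""
    flops = sum(g * g for g in encoding)   # proxy for Multiply-Adds
    error = 1.0 / (1.0 + sum(encoding))    # proxy for top-1 error
    return error, flops

def dominates(a, b):
    """Pareto dominance when minimizing both objectives."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def evolutionary_search(pop_size=32, generations=30):
    population = [sample_subnet() for _ in range(pop_size)]
    for _ in range(generations):
        # NAT would also adapt the supernet weights on the target task here
        # (the "online transfer learning" half of the integrated procedure).
        offspring = [mutate(random.choice(population)) for _ in range(pop_size)]
        scored = [(enc, evaluate(enc)) for enc in population + offspring]
        # Keep the non-dominated front, then fill up with the remainder.
        front = [enc for enc, f in scored
                 if not any(dominates(g, f) for _, g in scored)]
        rest = [enc for enc, _ in scored if enc not in front]
        population = (front + rest)[:pop_size]
    return population

if __name__ == "__main__":
    candidates = evolutionary_search()
    print(len(candidates), "candidate subnets on or near the Pareto front")

Because subnets inherit weights from the shared supernet, the evaluate step needs no per-candidate training, which is what makes searching once per task tractable compared with running a full NAS per deployment target.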
Pages: 2971-2989
Number of pages: 19