Few-shot Neural Architecture Search

Cited: 0
Authors
Zhao, Yiyang [1 ]
Wang, Linnan [2 ]
Tian, Yuandong [3 ]
Fonseca, Rodrigo [2 ]
Guo, Tian [1 ]
Affiliations
[1] Worcester Polytechnic Institute, Worcester, MA 01609 USA
[2] Brown University, Providence, RI 02912 USA
[3] Facebook AI Research, Menlo Park, CA USA
Source
International Conference on Machine Learning, Vol. 139, 2021
Funding
U.S. National Science Foundation
Keywords
(none listed)
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Efficient evaluation of a network architecture drawn from a large search space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS evaluates each architecture by training it from scratch, which yields the true performance but is extremely time-consuming. Recently, one-shot NAS has substantially reduced the computation cost by training only one supernetwork, a.k.a. supernet, to approximate the performance of every architecture in the search space via weight-sharing. However, the performance estimation can be very inaccurate due to the co-adaptation among operations (Bender et al., 2018). In this paper, we propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate the undesired co-adaptation. Compared to one-shot NAS, few-shot NAS improves the accuracy of architecture evaluation with a small increase in evaluation cost. With only up to 7 sub-supernets, few-shot NAS establishes new SoTAs: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MFLOPS and 77.5% top-1 accuracy at 238 MFLOPS; on CIFAR-10, it reaches 98.72% top-1 accuracy without using extra data or transfer learning. In AutoGAN, few-shot NAS outperforms the previously published results by up to 20%. Extensive experiments show that few-shot NAS significantly improves various one-shot methods, including 4 gradient-based and 6 search-based methods on 3 different tasks in NasBench-201 and NasBench1-shot-1.
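To make the space-partitioning idea in the abstract concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation): it splits a toy 4-edge search space into regions keyed by the operation chosen on one edge, so that each region would be covered by its own sub-supernet and weight-sharing stays confined to the architectures inside that region. The operation names, the edge count, and the choice of split edge are illustrative assumptions.

# Minimal sketch, assuming a toy search space; not the authors' code.
# Partition the one-shot search space by the operation picked on one edge;
# each region would then be covered by its own sub-supernet, so weight-sharing
# (and hence co-adaptation) is limited to the architectures in that region.
from itertools import product

OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]  # assumed candidate operations
NUM_EDGES = 4                                    # assumed toy search-space size

# Each architecture is encoded as one operation choice per edge.
search_space = list(product(OPS, repeat=NUM_EDGES))  # 4^4 = 256 architectures

def partition_by_edge(architectures, edge_index):
    """Group architectures by the operation they pick on `edge_index`."""
    regions = {op: [] for op in OPS}
    for arch in architectures:
        regions[arch[edge_index]].append(arch)
    return regions

if __name__ == "__main__":
    regions = partition_by_edge(search_space, edge_index=0)
    for op, archs in regions.items():
        # One sub-supernet per region; only these architectures share weights.
        print(f"sub-supernet for edge0={op}: covers {len(archs)} architectures")

In this toy split, the 4 candidate operations on the chosen edge yield 4 sub-supernets; the paper reports that as few as 7 sub-supernets already suffice for the accuracy gains quoted above.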
Pages: 12
References (60 in total)
[1] Akimoto Y., 2019, Proceedings of the 36th International Conference on Machine Learning (ICML), p. 171
[2] [Anonymous], 2018, Proceedings of Machine Learning Research
[3] [Anonymous], 2019, arXiv:1902.08142
[4] Arora S., 2019, International Conference on Learning Representations (ICLR)
[5] Baker, 2017, ICLR, p. 1, DOI 10.48550/arXiv.1611.02167
[6] Bergstra J., 2012, NIPS
[7] Cai H., 2019, ICLR, p. 1, DOI 10.48550/arXiv.1812.00332
[8] Cai H., 2020, ICLR, p. 1
[9] Carbonnelle S., 2018, CoRR, abs/1806.01603
[10] Chen X., 2019, CoRR, abs/1912.10952