Few-shot Neural Architecture Search

Cited: 0
Authors
Zhao, Yiyang [1 ]
Wang, Linnan [2 ]
Tian, Yuandong [3 ]
Fonseca, Rodrigo [2 ]
Guo, Tian [1 ]
Affiliations
[1] Worcester Polytechnic Institute, Worcester, MA 01609 USA
[2] Brown University, Providence, RI 02912 USA
[3] Facebook AI Research, Menlo Park, CA USA
Source
International Conference on Machine Learning, Vol. 139, 2021
Funding
U.S. National Science Foundation
Keywords
(none listed)
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Efficient evaluation of a network architecture drawn from a large search space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS evaluates each architecture by training it from scratch, which yields the true performance but is extremely time-consuming. Recently, one-shot NAS has substantially reduced the computation cost by training only one supernetwork, a.k.a. supernet, to approximate the performance of every architecture in the search space via weight-sharing. However, the performance estimation can be very inaccurate due to the co-adaptation among operations (Bender et al., 2018). In this paper, we propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate the undesired co-adaptation. Compared to one-shot NAS, few-shot NAS improves the accuracy of architecture evaluation with a small increase in evaluation cost. With only up to 7 sub-supernets, few-shot NAS establishes new SoTAs: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MFLOPS and 77.5% top-1 accuracy at 238 MFLOPS; on CIFAR-10, it reaches 98.72% top-1 accuracy without using extra data or transfer learning. In AutoGAN, few-shot NAS outperforms the previously published results by up to 20%. Extensive experiments show that few-shot NAS significantly improves various one-shot methods, including 4 gradient-based and 6 search-based methods on 3 different tasks in NasBench-201 and NasBench1-shot-1.
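To make the space-partitioning idea in the abstract concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation): it splits a toy 4-edge search space into regions keyed by the operation chosen on one edge, so that each region would be covered by its own sub-supernet and weight-sharing stays confined to the architectures inside that region. The operation names, the edge count, and the choice of split edge are illustrative assumptions.

# Minimal sketch, assuming a toy search space; not the authors' code.
# Partition the one-shot search space by the operation picked on one edge;
# each region would then be covered by its own sub-supernet, so weight-sharing
# (and hence co-adaptation) is limited to the architectures in that region.
from itertools import product

OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]  # assumed candidate operations
NUM_EDGES = 4                                    # assumed toy search-space size

# Each architecture is encoded as one operation choice per edge.
search_space = list(product(OPS, repeat=NUM_EDGES))  # 4^4 = 256 architectures

def partition_by_edge(architectures, edge_index):
    """Group architectures by the operation they pick on `edge_index`."""
    regions = {op: [] for op in OPS}
    for arch in architectures:
        regions[arch[edge_index]].append(arch)
    return regions

if __name__ == "__main__":
    regions = partition_by_edge(search_space, edge_index=0)
    for op, archs in regions.items():
        # One sub-supernet per region; only these architectures share weights.
        print(f"sub-supernet for edge0={op}: covers {len(archs)} architectures")

In this toy split, the 4 candidate operations on the chosen edge yield 4 sub-supernets; the paper reports that as few as 7 sub-supernets already suffice for the accuracy gains quoted above.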
Pages: 12
References (60 in total)
[1] Akimoto Y., 2019, Proceedings of the 36th International Conference on Machine Learning (ICML), p. 171
[2] [Anonymous], 2018, Proceedings of Machine Learning Research
[3] [Anonymous], 2019, arXiv:1902.08142
[4] Arora S., 2019, International Conference on Learning Representations (ICLR)
[5] Baker, 2017, ICLR, p. 1, DOI 10.48550/arXiv.1611.02167
[6] Bergstra J., 2012, NIPS
[7] Cai H., 2019, ICLR, p. 1, DOI 10.48550/arXiv.1812.00332
[8] Cai H., 2020, ICLR, p. 1
[9] Carbonnelle S., 2018, CoRR, abs/1806.01603
[10] Chen X., 2019, CoRR, abs/1912.10952