Discretization-aware architecture search

Cited by: 18

Authors:

Tian, Yunjie [1]

Liu, Chang [1]

Xie, Lingxi [2]

Jiao, Jianbin [1]

Ye, Qixiang [1]

Affiliations:

[1] University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China

[2] Huawei Inc., Noah's Ark Lab, Beijing, People's Republic of China

Source:

Pattern Recognition | 2021 / Vol. 120

Keywords:

Neural architecture search; Weight-sharing; Discretization-aware; Imbalanced network configuration

DOI:

10.1016/j.patcog.2021.108186

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The search cost of neural architecture search (NAS) has been greatly reduced by differentiable architecture search and weight-sharing methods. Such methods optimize a super-network containing all candidate edges and operations, and determine the optimal sub-network by discretization, i.e., pruning operations and edges with small weights. However, discretization, whether performed on operations or on edges, incurs significant inaccuracy, so the quality of the resulting architecture is not guaranteed. In this paper, we propose discretization-aware architecture search (DA²S), which aims to push the super-network towards the configuration of the desired topology. DA²S is implemented with an entropy-based loss term, which can be added as a regularizer to differentiable architecture search in a plug-and-play fashion. The regularization is controlled by carefully designed continuation functions, so that discretization adapts to the dynamic change of edges and operations. Experiments on standard image classification benchmarks demonstrate the effectiveness of our approach, in particular under imbalanced network configurations that were not studied before. © 2021 Elsevier Ltd. All rights reserved.
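The idea of an entropy-based loss term whose strength follows a continuation function can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes softmax-normalized operation weights on each edge, and the linear ramp in `continuation` is a hypothetical schedule chosen only for illustration. All function names here are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of architecture logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-12):
    """Shannon entropy; close to zero when one candidate holds all the weight."""
    return -sum(p * math.log(p + eps) for p in probs)

def continuation(epoch, total_epochs, max_weight=1.0):
    """Hypothetical schedule: regularization strength ramps up linearly."""
    return max_weight * epoch / total_epochs

def regularized_loss(task_loss, edge_logits, epoch, total_epochs):
    """Task loss plus a scheduled entropy penalty summed over all edges."""
    penalty = sum(entropy(softmax(logits)) for logits in edge_logits)
    return task_loss + continuation(epoch, total_epochs) * penalty

# An undecided edge (uniform weights) is penalized more than a nearly
# one-hot edge, so minimizing the loss pushes weights towards a discrete choice.
uniform = [[0.0, 0.0, 0.0, 0.0]]
peaked = [[10.0, 0.0, 0.0, 0.0]]
loss_uniform = regularized_loss(1.0, uniform, epoch=50, total_epochs=50)
loss_peaked = regularized_loss(1.0, peaked, epoch=50, total_epochs=50)
```

With the penalty weight at zero early in the search, the super-network trains freely; as the weight grows, architecture weights are driven towards near one-hot values, so the final pruning step removes little that the network still relies on.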


Pages: 12
