You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization

Cited by: 31
Authors
Zhang, Xinbang [1 ,2 ]
Huang, Zehao [3 ]
Wang, Naiyan [3 ]
Xiang, Shiming [1 ,2 ]
Pan, Chunhong [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Dept Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Tusimple, Beijing 100020, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Computer architecture; Optimization; Learning (artificial intelligence); Task analysis; Acceleration; Evolutionary computation; Convolution; Neural architecture search (NAS); convolutional neural network; sparse optimization; NETWORKS; ALGORITHM; GAME; GO;
DOI
10.1109/TPAMI.2020.3020300
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, neural architecture search (NAS) has attracted great interest in both academia and industry. However, it remains challenging because of its huge, non-continuous search space. Instead of applying evolutionary algorithms or reinforcement learning as in previous works, this paper proposes a direct sparse optimization NAS (DSO-NAS) method. The motivation behind DSO-NAS is to address the task from the viewpoint of model pruning. To achieve this goal, we start from a completely connected block, and then introduce scaling factors to scale the information flow between operations. Next, sparse regularizations are imposed to prune useless connections in the architecture. Finally, an efficient and theoretically sound optimization method is derived to solve the resulting problem. Our method enjoys the advantages of both differentiability and efficiency, so it can be directly applied to large datasets like ImageNet and to tasks beyond classification. In particular, on the CIFAR-10 dataset DSO-NAS achieves an average test error of 2.74 percent, while on the ImageNet dataset it achieves 25.4 percent test error under 600M FLOPs, searched with 8 GPUs in 18 hours. On the semantic segmentation task, DSO-NAS also achieves competitive results compared with manually designed architectures on the PASCAL VOC dataset. Code is available at https://github.com/XinbangZhang/DSO-NAS.
Pages: 2891-2904
Number of pages: 14
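
The abstract outlines the search procedure at a high level: attach a learnable scaling factor to every connection in a completely connected block, impose a sparse regularization on those factors, and optimize so that useless connections shrink to exactly zero and can be pruned away. Below is a minimal, hypothetical PyTorch sketch of that idea. The names (FullyConnectedBlock, soft_threshold), the choice of 3x3 convolutions as the candidate operations, and the plain L1 soft-thresholding step are all illustrative assumptions; the paper derives its own efficient, theoretically sound optimization method rather than this simple proximal update.

```python
import torch
import torch.nn as nn

def soft_threshold(x: torch.Tensor, thresh: float) -> torch.Tensor:
    # Proximal operator of the L1 norm: shrink every magnitude by
    # `thresh`, setting small scaling factors exactly to zero
    # (a zero factor corresponds to a pruned connection).
    return torch.sign(x) * torch.clamp(x.abs() - thresh, min=0.0)

class FullyConnectedBlock(nn.Module):
    # Hypothetical stand-in for the "completely connected block":
    # operation j receives the block input and the outputs of all
    # earlier operations, each scaled by a learnable factor lam[i, j].
    def __init__(self, num_ops: int, channels: int):
        super().__init__()
        self.ops = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1)
            for _ in range(num_ops)
        )
        self.lam = nn.Parameter(torch.ones(num_ops, num_ops))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        states = [x]
        for j, op in enumerate(self.ops):
            # Weighted sum of all earlier states; connections whose
            # factors have been driven to zero contribute nothing.
            inp = sum(self.lam[i, j] * s for i, s in enumerate(states))
            states.append(op(inp))
        return states[-1]

block = FullyConnectedBlock(num_ops=4, channels=8)
opt = torch.optim.SGD(block.parameters(), lr=0.01)
x = torch.randn(2, 8, 16, 16)

loss = block(x).pow(2).mean()   # stand-in for the real task loss
opt.zero_grad()
loss.backward()
opt.step()
with torch.no_grad():           # proximal step enforcing sparsity
    block.lam.copy_(soft_threshold(block.lam, thresh=1e-3))
```

After training with such updates, the surviving nonzero factors define the searched architecture: connections whose factors reached zero are removed from the block, which is exactly the model-pruning view of NAS the abstract describes.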