NAP: Neural architecture search with pruning

Cited by: 23
Authors
Ding, Yadong [1 ]
Wu, Yu [2 ]
Huang, Chengyue [1 ]
Tang, Siliang [1 ]
Wu, Fei [1 ]
Yang, Yi [2 ]
Zhu, Wenwu [3 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Zhejiang, Peoples R China
[2] Univ Technol Sydney, Australian Artificial Intelligence Inst, ReLER Lab, Sydney, NSW, Australia
[3] Tsinghua Univ, Tsinghua Berkeley Shenzhen Inst, Dept Comp Sci & Technol, Beijing Key Lab Networked, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords
Neural architecture search; Network pruning; Computer vision;
DOI
10.1016/j.neucom.2021.12.002
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural Architecture Search (NAS) has attracted continuously increasing attention. Due to their computational efficiency, gradient-based NAS methods like DARTS have become the most popular framework for NAS tasks. Nevertheless, as the search iterates, the model derived by previous NAS frameworks becomes dominated by skip-connects, causing a collapse in performance. In this work, we present a novel approach to alleviate this issue, named Neural Architecture search with Pruning (NAP). Unlike prior differentiable architecture search works, our approach draws its idea from network pruning. We first train an over-parameterized network that includes all candidate operations. We then propose a criterion to prune the network. Based on a newly designed relaxation of the architecture representation, NAP can derive the most potent model by removing trivial and redundant edges from the whole network topology. Experiments show the effectiveness of our proposed approach. Specifically, the model searched by NAP achieves state-of-the-art performance (2.48% test error) on CIFAR-10. We transfer the model to ImageNet and obtain a 25.1% test error with only 5.0 M parameters, which is on par with modern NAS methods. (c) 2021 Published by Elsevier B.V.
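The abstract describes a two-stage idea: train an over-parameterized network containing all candidate operations, then prune trivial and redundant edges to derive the final architecture. The paper's actual relaxation and pruning criterion are not given here, so the following is only a minimal NumPy sketch of the general pattern: each edge carries (already-trained) architecture logits over candidate operations, edges are scored by their strongest operation under a softmax relaxation, weak edges are pruned, and the best operation is kept on each surviving edge. The edge names, the scoring rule, and the `keep_edges` parameter are illustrative assumptions, not NAP's specifics.

```python
import numpy as np

def prune_architecture(alpha, keep_edges=2):
    """Toy pruning-style architecture derivation (illustrative only).

    alpha: dict mapping edge name -> 1-D array of trained architecture
    logits, one entry per candidate operation on that edge.
    Returns a dict mapping each kept edge to the index of its
    strongest operation.
    """
    scores, choices = {}, {}
    for edge, logits in alpha.items():
        w = np.exp(logits - logits.max())
        w /= w.sum()                      # softmax relaxation of the edge
        best = int(np.argmax(w))
        scores[edge] = float(w[best])     # edge importance = top op weight
        choices[edge] = best
    # Prune: keep only the highest-scoring edges of the topology
    kept = sorted(scores, key=scores.get, reverse=True)[:keep_edges]
    return {e: choices[e] for e in kept}

# Hypothetical trained logits for three edges, two candidate ops each
alpha = {
    "n0->n2": np.array([2.0, 0.1]),   # op 0 clearly dominates
    "n1->n2": np.array([0.3, 1.5]),   # op 1 clearly dominates
    "n0->n1": np.array([0.2, 0.25]),  # near-uniform: a trivial edge
}
print(prune_architecture(alpha, keep_edges=2))
# The near-uniform edge "n0->n1" is pruned; the two decisive edges survive.
```

The intuition this sketch encodes is the one the abstract states: rather than letting a degenerate operation (e.g. skip-connect) accumulate weight everywhere, the network is treated as a pruning problem and edges whose operation weights are indecisive are removed from the topology.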
Pages: 85-95
Page count: 11