You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization

Cited by: 31
Authors
Zhang, Xinbang [1 ,2 ]
Huang, Zehao [3 ]
Wang, Naiyan [3 ]
Xiang, Shiming [1 ,2 ]
Pan, Chunhong [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Dept Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Tusimple, Beijing 100020, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Computer architecture; Optimization; Learning (artificial intelligence); Task analysis; Acceleration; Evolutionary computation; Convolution; Neural architecture search (NAS); convolutional neural network; sparse optimization; NETWORKS; ALGORITHM; GAME; GO;
DOI
10.1109/TPAMI.2020.3020300
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, neural architecture search (NAS) has attracted great interest in both academia and industry. However, it remains challenging because of its huge, non-continuous search space. Instead of applying evolutionary algorithms or reinforcement learning as in previous works, this paper proposes a direct sparse optimization NAS (DSO-NAS) method. The motivation behind DSO-NAS is to address the task from the viewpoint of model pruning. To achieve this goal, we start from a completely connected block, and then introduce scaling factors to scale the information flow between operations. Next, sparse regularizations are imposed to prune useless connections in the architecture. Finally, an efficient and theoretically sound optimization method is derived to solve the resulting problem. Our method enjoys the advantages of both differentiability and efficiency, so it can be directly applied to large datasets like ImageNet and to tasks beyond classification. In particular, on the CIFAR-10 dataset DSO-NAS achieves an average test error of 2.74 percent, while on the ImageNet dataset it achieves 25.4 percent test error under 600M FLOPs, searched with 8 GPUs in 18 hours. On the semantic segmentation task, DSO-NAS also achieves competitive results compared with manually designed architectures on the PASCAL VOC dataset. Code is available at https://github.com/XinbangZhang/DSO-NAS.
Pages: 2891-2904
Number of pages: 14
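
The abstract outlines the search procedure at a high level: attach a learnable scaling factor to every connection in a completely connected block, impose a sparse regularization on those factors, and optimize so that useless connections shrink to exactly zero and can be pruned away. Below is a minimal, hypothetical PyTorch sketch of that idea. The names (FullyConnectedBlock, soft_threshold), the choice of 3x3 convolutions as the candidate operations, and the plain L1 soft-thresholding step are all illustrative assumptions; the paper derives its own efficient, theoretically sound optimization method rather than this simple proximal update.

```python
import torch
import torch.nn as nn

def soft_threshold(x: torch.Tensor, thresh: float) -> torch.Tensor:
    # Proximal operator of the L1 norm: shrink every magnitude by
    # `thresh`, setting small scaling factors exactly to zero
    # (a zero factor corresponds to a pruned connection).
    return torch.sign(x) * torch.clamp(x.abs() - thresh, min=0.0)

class FullyConnectedBlock(nn.Module):
    # Hypothetical stand-in for the "completely connected block":
    # operation j receives the block input and the outputs of all
    # earlier operations, each scaled by a learnable factor lam[i, j].
    def __init__(self, num_ops: int, channels: int):
        super().__init__()
        self.ops = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1)
            for _ in range(num_ops)
        )
        self.lam = nn.Parameter(torch.ones(num_ops, num_ops))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        states = [x]
        for j, op in enumerate(self.ops):
            # Weighted sum of all earlier states; connections whose
            # factors have been driven to zero contribute nothing.
            inp = sum(self.lam[i, j] * s for i, s in enumerate(states))
            states.append(op(inp))
        return states[-1]

block = FullyConnectedBlock(num_ops=4, channels=8)
opt = torch.optim.SGD(block.parameters(), lr=0.01)
x = torch.randn(2, 8, 16, 16)

loss = block(x).pow(2).mean()   # stand-in for the real task loss
opt.zero_grad()
loss.backward()
opt.step()
with torch.no_grad():           # proximal step enforcing sparsity
    block.lam.copy_(soft_threshold(block.lam, thresh=1e-3))
```

After training with such updates, the surviving nonzero factors define the searched architecture: connections whose factors reached zero are removed from the block, which is exactly the model-pruning view of NAS the abstract describes.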