Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities

Cited by: 1
Authors
Li, Guihong [1]
Hoang, Duc [1]
Bhardwaj, Kartikeya [2]
Lin, Ming [3]
Wang, Zhangyang [1]
Marculescu, Radu [1]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Qualcomm Technol Inc, Qualcomm AI Res, San Diego, CA 92121 USA
[3] Amazon, Seattle, WA 98004 USA
Keywords
Neural architecture search; zero-shot proxy; hardware-aware neural network design; CONVERGENCE; NETWORKS
DOI
10.1109/TPAMI.2024.3395423
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from the expensive training process. The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of some given networks without training the network parameters. The proxies proposed so far are usually inspired by recent progress in theoretical understanding of deep learning and have shown great potential on several datasets and NAS benchmarks. This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches, with an emphasis on their hardware awareness. To this end, we first review the mainstream zero-shot proxies and discuss their theoretical underpinnings. We then compare these zero-shot proxies through large-scale experiments and demonstrate their effectiveness in both hardware-aware and hardware-oblivious NAS scenarios. Finally, we point out several promising ideas to design better proxies.
Pages: 7618-7635
Page count: 18