Multi-Objective Neural Architecture Search by Learning Search Space Partitions

Cited by: 0
Authors
Zhao, Yiyang [1 ]
Wang, Linnan [2 ]
Guo, Tian [1 ]
Affiliations
[1] Worcester Polytech Inst, Worcester, MA 01609 USA
[2] Brown Univ, Providence, RI USA
Funding
U.S. National Science Foundation
Keywords
Neural Architecture Search; Monte Carlo Tree Search; AutoML; Deep Learning; Evolutionary Algorithms; Optimization; Diversity; Network
DOI
N/A
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Deploying deep learning models requires taking into account neural network metrics such as model size, inference latency, and #FLOPs, in addition to inference accuracy. As a result, deep learning model designers leverage multi-objective optimization to design deep neural networks that are effective across multiple criteria. However, applying multi-objective optimization to neural architecture search (NAS) is nontrivial because NAS tasks usually have a huge search space along with a non-negligible search cost, which calls for effective multi-objective search algorithms to reduce GPU costs. In this work, we implement a novel multi-objective optimizer, based on the recently proposed meta-algorithm LaMOO (Zhao et al., 2022), for NAS tasks. In a nutshell, LaMOO speeds up the search by learning a model from observed samples to partition the search space and then focusing on promising regions likely to contain a subset of the Pareto frontier. Using LaMOO, we observe an improvement of more than 200% in sample efficiency compared to Bayesian optimization and evolutionary multi-objective optimizers on different NAS datasets. For example, when combined with LaMOO, qEHVI achieves a 225% improvement in sample efficiency compared to using qEHVI alone on NasBench201. For real-world tasks, LaMOO achieves 97.36% accuracy with only 1.62M #Params on CIFAR10 using only 600 search samples. On ImageNet, our large model reaches 80.4% top-1 accuracy with only 522M #FLOPs.
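The core idea the abstract describes, learning a partition of the search space from observed samples and focusing on the region likely to contain part of the Pareto frontier, can be illustrated with a short sketch. The Python snippet below is a minimal, hypothetical illustration (not the authors' released LaMOO implementation): it ranks observed samples by dominance number, fits an off-the-shelf SVM classifier (scikit-learn's SVC) to separate the more promising half, and filters new candidates to the predicted-promising region. The architecture encodings, array shapes, and median-based split are all illustrative assumptions.

```python
# Minimal sketch of learned search-space partitioning for multi-objective
# search (illustrative only, not the authors' implementation).
# Assumes every objective is to be maximized.
import numpy as np
from sklearn.svm import SVC

def dominance_number(Y):
    """For each sample, count how many other samples dominate it.
    Lower is better; 0 means the sample lies on the Pareto frontier."""
    n = len(Y)
    dom = np.zeros(n, dtype=int)
    for i in range(n):
        # j dominates i if j >= i in every objective and > in at least one
        ge = np.all(Y >= Y[i], axis=1)
        gt = np.any(Y > Y[i], axis=1)
        dom[i] = np.sum(ge & gt)
    return dom

def learn_partition(X, Y):
    """Label the less-dominated half of the samples as 'promising' and fit a
    classifier whose decision boundary partitions the search space."""
    dom = dominance_number(Y)
    labels = (dom <= np.median(dom)).astype(int)  # 1 = promising half
    return SVC(kernel="rbf").fit(X, labels)

# Usage: keep only candidate architectures that fall inside the promising
# region, then hand them to any multi-objective optimizer (e.g., qEHVI).
rng = np.random.default_rng(0)
X = rng.random((128, 6))   # encoded architectures (illustrative)
Y = rng.random((128, 2))   # observed objectives, e.g. (accuracy, -latency)
clf = learn_partition(X, Y)
candidates = rng.random((1024, 6))
promising = candidates[clf.predict(candidates) == 1]
```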
Pages: 41