Building High-Throughput Neural Architecture Search Workflows via a Decoupled Fitness Prediction Engine

Cited by: 5
Authors
Keller Rorabaugh, Ariel [1 ]
Caino-Lores, Silvina [1 ]
Johnston, Travis [2 ]
Taufer, Michela [1 ]
Affiliations
[1] Univ Tennessee Knoxville, Knoxville, TN 37996 USA
[2] Striveworks, Austin, TX 78731 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
Training; Artificial neural networks; Predictive models; Parametric statistics; Engines; Search problems; Data models; Machine learning; artificial intelligence; performance prediction; neural networks
DOI
10.1109/TPDS.2022.3140681
CLC Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Neural networks (NNs) are used in high-performance computing and high-throughput analysis to extract knowledge from datasets. Neural architecture search (NAS) automates NN design by generating, training, and analyzing thousands of NNs. However, NAS requires massive computational power for NN training. To address the challenges of efficiency and scalability, we propose PENGUIN, a decoupled fitness prediction engine that informs the search without interfering in it. PENGUIN uses parametric modeling to predict the fitness of NNs. Existing NAS methods and parametric modeling functions can be plugged into PENGUIN to build flexible NAS workflows. Through this decoupling and flexible parametric modeling, PENGUIN reduces training costs: it predicts the fitness of NNs, enabling NAS to terminate training NNs early. Early termination increases the number of NNs that fixed compute resources can evaluate, thus giving NAS additional opportunities to find better NNs. We assess the effectiveness of our engine on 6,000 NNs across three diverse benchmark datasets and three state-of-the-art NAS implementations using the Summit supercomputer. Augmenting these NAS implementations with PENGUIN can increase throughput by a factor of 1.6 to 7.1. Furthermore, walltime tests indicate that PENGUIN can reduce training time by a factor of 2.5 to 5.3.
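The sketch below illustrates the general idea of parametric fitness prediction with early termination described in the abstract: fit a parametric learning-curve model to a candidate NN's early validation accuracy, extrapolate to the final epoch, and stop training if the predicted fitness falls below a threshold. It is a minimal, hypothetical example; the specific curve form (`power_law`), thresholds, and function names are illustrative assumptions, not the parametric functions or interfaces used in PENGUIN itself.

```python
# Hypothetical sketch of early-termination fitness prediction via parametric
# modeling. Curve form, names, and thresholds are illustrative assumptions,
# not the paper's actual implementation.
import numpy as np
from scipy.optimize import curve_fit

def power_law(epoch, a, b, c):
    # Simple learning-curve model: accuracy rises toward an asymptote `a`.
    return a - b * np.power(epoch, -c)

def predict_final_fitness(val_acc_history, final_epoch=50):
    """Fit the curve to observed epochs and extrapolate to final_epoch."""
    epochs = np.arange(1, len(val_acc_history) + 1, dtype=float)
    params, _ = curve_fit(power_law, epochs, val_acc_history,
                          p0=[0.9, 0.5, 1.0], maxfev=10000)
    return power_law(final_epoch, *params)

def should_terminate_early(val_acc_history, fitness_threshold=0.7,
                           min_epochs=5, final_epoch=50):
    """Stop training a candidate NN if its predicted fitness is too low."""
    if len(val_acc_history) < min_epochs:
        return False  # not enough observations to fit the curve yet
    predicted = predict_final_fitness(val_acc_history, final_epoch)
    return predicted < fitness_threshold

# Example: a candidate whose early validation accuracy plateaus around 0.55
history = [0.30, 0.42, 0.48, 0.52, 0.54, 0.55]
print(should_terminate_early(history))  # likely True under these assumptions
```

Because the prediction step only consumes per-epoch metrics, such an engine can sit alongside an existing NAS method rather than inside it, which is the decoupling the abstract emphasizes.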
Pages: 2913-2926
Page count: 14