MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks

Cited by: 9
Authors
Waleffe, Roger [1 ]
Mohoney, Jason [1 ]
Rekatsinas, Theodoros [2 ]
Venkataraman, Shivaram [1 ]
Affiliations
[1] Univ Wisconsin Madison, Madison, WI 53706 USA
[2] Swiss Fed Inst Technol, Zurich, Switzerland
Source
PROCEEDINGS OF THE EIGHTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS, EUROSYS 2023 | 2023
Funding
U.S. National Science Foundation;
Keywords
GNNs; GNN Training; Multi-hop Sampling;
DOI
10.1145/3552326.3567501
Chinese Library Classification
TP3 [Computing and Computer Technology];
Discipline Classification Code
0812;
Abstract
We study the training of Graph Neural Networks (GNNs) on large-scale graphs. We revisit the premise of using distributed training for billion-scale graphs and show that, for graphs that fit in the main memory or the SSD of a single machine, out-of-core pipelined training with a single GPU can outperform state-of-the-art (SoTA) multi-GPU solutions. We introduce MariusGNN, the first system that utilizes the entire storage hierarchy, including disk, for GNN training. MariusGNN introduces a series of data organization and algorithmic contributions that 1) minimize the end-to-end time required for training and 2) ensure that models learned with disk-based training exhibit accuracy similar to those trained fully in memory. We evaluate MariusGNN against SoTA systems for learning GNN models and find that single-GPU training in MariusGNN reaches the same level of accuracy up to 8x faster than multi-GPU training in those systems, yielding an order-of-magnitude reduction in monetary cost. MariusGNN is open-sourced at www.marius-project.org.
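To make the out-of-core pipelining idea concrete, below is a minimal Python sketch of how a single-GPU trainer can overlap disk I/O with compute by prefetching graph partitions on a background thread. This is an illustration of the general technique under stated assumptions, not MariusGNN's actual implementation or API: partition_paths, load_partition, and train_on_partition are hypothetical placeholders, and the .npy on-disk format is an assumption.

# Minimal sketch of out-of-core pipelined training: a background thread
# prefetches graph partitions from disk while the GPU trains on the
# partition already in memory. All names here are hypothetical
# placeholders, not MariusGNN's API.
import queue
import threading

import numpy as np

def load_partition(path):
    # Assumed on-disk format: one partition's data stored as a .npy array.
    return np.load(path)

def prefetch(partition_paths, buffer, num_epochs):
    # Producer: stream partitions from disk into a bounded buffer so that
    # disk I/O overlaps with GPU compute instead of serializing with it.
    for _ in range(num_epochs):
        for path in partition_paths:
            buffer.put(load_partition(path))  # blocks when buffer is full
    buffer.put(None)  # sentinel: no more partitions

def train_out_of_core(partition_paths, train_on_partition, num_epochs=1):
    # A bounded queue caps host-memory use at a few partitions at a time.
    buffer = queue.Queue(maxsize=2)
    producer = threading.Thread(
        target=prefetch, args=(partition_paths, buffer, num_epochs),
        daemon=True,
    )
    producer.start()
    while True:
        partition = buffer.get()
        if partition is None:
            break
        # Consumer: e.g., sample mini-batches from this partition and step
        # the model on the GPU while the next partition loads in parallel.
        train_on_partition(partition)
    producer.join()

The bounded buffer is the key design choice: it keeps memory usage fixed regardless of graph size, while the producer/consumer split hides disk latency behind GPU work, which is the premise behind the single-GPU speedups described in the abstract.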
Pages: 144-161
Page count: 18