JUWELS Booster - A Supercomputer for Large-Scale AI Research

Cited by: 3
Authors
Kesselheim, Stefan [1]
Herten, Andreas [1]
Krajsek, Kai [1]
Ebert, Jan [1]
Jitsev, Jenia [1]
Cherti, Mehdi [1]
Langguth, Michael [1]
Gong, Bing [1]
Stadtler, Scarlet [1]
Mozaffari, Amirpasha [1]
Cavallaro, Gabriele [1]
Sedona, Rocco [1, 2]
Schug, Alexander [1, 3]
Strube, Alexandre [1]
Kamath, Roshni [1]
Schultz, Martin G. [1]
Riedel, Morris [1, 2]
Lippert, Thomas [1]
Affiliations
[1] Forschungszentrum Julich, Julich Supercomp Ctr, Julich, Germany
[2] Univ Iceland, Sch Engn & Nat Sci, Reykjavik, Iceland
[3] Univ Duisburg Essen, Duisburg, Germany
Source
HIGH PERFORMANCE COMPUTING - ISC HIGH PERFORMANCE DIGITAL 2021 INTERNATIONAL WORKSHOPS | 2021 / Vol. 12761
Funding
European Research Council; EU Horizon 2020
Keywords
Artificial Intelligence; Deep learning; GPU; High-performance computing; Machine learning; Jülich Supercomputing Centre; DIRECT-COUPLING ANALYSIS; RNA; IDENTIFICATION
DOI
10.1007/978-3-030-90539-2_31
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Centre. Its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast InfiniBand interconnect, makes it an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, its support for parallel and distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research applications by presenting large-scale AI research highlights from various scientific fields that require such a facility.
Pages: 453-468
Page count: 16