Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems

Cited by: 2
Authors
Ocetkiewicz, Krzysztof M. [1]
Czaplewski, Cezary [1, 2, 3]
Krawczyk, Henryk [1, 4]
Lipska, Agnieszka G. [1]
Liwo, Adam [1]
Proficz, Jerzy [1]
Sieradzan, Adam K. [1, 2, 3]
Czarnul, Pawel [4]
Affiliations
[1] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Ctr Informat Tricity Acad Supercomp & Network CI T, Narutowicza 11-12, PL-80233 Gdansk, Poland
[2] Univ Gdansk, Fahrenheit Union Univ Gdansk, Fac Chem, Wita Stwosza 63, PL-80308 Gdansk, Poland
[3] Korea Inst Adv Study, Sch Computat Sci, Seoul 02455, South Korea
[4] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Fac Elect Telecommun & Informat, Narutowicza 11-12, PL-80233 Gdansk, Poland
Keywords
Multi-GPU scalability; UNRES; Coarse graining; Protein dynamics; High performance computing; MOLECULAR-DYNAMICS SIMULATIONS; UNITED-RESIDUE MODEL; FORCE-FIELD; POLYPEPTIDE-CHAINS; FOLDING PATHWAYS; ALL-ATOM; TESTS; AGGREGATION; ALPHA; TEMPERATURE;
DOI
10.1016/j.cpc.2024.109112
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Graphical Processor Units (GPUs) are nowadays widely used in all-atom molecular simulations because of the advantage of efficient partitioning of atom pairs between the kernels to compute the contributions to energy and forces, thus enabling the treatment of very large systems. Extension of time- and size-scale of computations is also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended interaction sites. Implementation of CG codes on the GPUs, particularly the multiple-GPU platforms is, however, a challenge due to more complicated potentials and removing the explicit solvent, forcing developers to do interaction- rather than space-domain decomposition. In this paper, we propose a design of a multi-GPU coarse-grained simulator and report the implementation of the heavily coarse-grained physics-based UNited RESidue (UNRES) model of polypeptide chains. By moving all computations to GPUs and keeping the communication with CPUs to a minimum, we managed to achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues, this result making UNRES the best scalable coarse-grained software and enabling us to do laboratory-time millisecond-scale simulations of such cell components as tubulin within days of wall-clock time.

Program summary
Program Title: Multi-GPU UNRES
CPC Library link to program files: https://doi.org/10.17632/hz9s4nwncf.1
Developer's repository link: https://projects.task.gda.pl/eurohpcpl-public/unres
Licensing provisions: GPLv3
Programming language: Fortran + C++/CUDA
Nature of problem: Physics-based simulations of protein systems at biologically relevant time- and size-scale are demanding and consequently require both the simplification of biomolecule representation and substantial computational resources. UNRES (from UNited RESidue) is a physics-based reduced model of polypeptide chains with which to run large-scale coarse-grained simulations of protein structure and dynamics. It enables the researchers to study protein folding, protein dynamics, and protein-protein interactions in a physically realistic manner and further unveil biological processes' mechanisms. Examples of biological applications include studies of amyloid formations, signaling mechanism, and action of molecular chaperones.
Solution method: The presented Multi-GPU UNRES relies on a highly optimized GPU implementation of noncentral forces using modern CUDA constructs. Fundamentally, it is possible by proposed efficient partitioning and assignment of the interaction domain onto GPU resources. We moved as many computations as possible to the device (GPU) side. In most cases, computations are defined and scheduled as CUDA graphs. In selected cases, scheduling kernels manually yields slightly better performance. To maximize parallelism, multiple CUDA streams are used. Furthermore, the code visibly benefits from a tree-based allreduce shared-memory-based algorithm. Additionally, if present within hardware, peer memory access is enabled between all GPUs and the allreduce algorithm takes advantage of it. This feature has made the UNRES coarse-grained protein model with implicit solvent scalable for multi-GPUs so that we could achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues.
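The solution method names a tree-based allreduce without showing its shape. The sketch below is a minimal host-side illustration of the general pattern, not the UNRES implementation. It assumes each device's partial result (for example, partial forces) can be modelled as a plain `std::vector<double>`; the names `Buffer` and `tree_allreduce_sum` are hypothetical. In a real multi-GPU code the additions and copies would presumably be device-to-device transfers, using peer memory access where the hardware offers it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one GPU's partial result (e.g. partial forces).
using Buffer = std::vector<double>;

// Tree-based allreduce (sum) over per-device buffers, simulated on the host.
// Reduce phase: at stride s = 1, 2, 4, ... device i adds the buffer of
// device i + s, so device 0 ends up holding the global sum.
// Broadcast phase: the sum is copied back down the same tree in reverse.
void tree_allreduce_sum(std::vector<Buffer>& dev) {
    const std::size_t n = dev.size();
    if (n == 0) return;
    for (const Buffer& b : dev) assert(b.size() == dev[0].size());

    // Reduce phase: pairs at distance `stride` are combined in each round.
    std::size_t stride = 1;
    for (; stride < n; stride *= 2) {
        for (std::size_t i = 0; i + stride < n; i += 2 * stride) {
            for (std::size_t k = 0; k < dev[i].size(); ++k) {
                dev[i][k] += dev[i + stride][k];
            }
        }
    }

    // Broadcast phase: walk the strides in reverse order, copying the total.
    for (stride /= 2; stride >= 1; stride /= 2) {
        for (std::size_t i = 0; i + stride < n; i += 2 * stride) {
            dev[i + stride] = dev[i];
        }
    }
}
```

With 8 buffers this takes 3 reduce rounds and 3 broadcast rounds, and the pairs within a round are independent of each other, which is what allows them to proceed concurrently on real hardware. The loop bounds also handle device counts that are not powers of two.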
Pages: 15