Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems

Cited by: 2
Authors
Ocetkiewicz, Krzysztof M. [1]
Czaplewski, Cezary [1, 2, 3]
Krawczyk, Henryk [1, 4]
Lipska, Agnieszka G. [1]
Liwo, Adam [1]
Proficz, Jerzy [1]
Sieradzan, Adam K. [1, 2, 3]
Czarnul, Pawel [4]
Affiliations
[1] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Ctr Informat Tricity Acad Supercomp & Network CI T, Narutowicza 11-12, PL-80233 Gdansk, Poland
[2] Univ Gdansk, Fahrenheit Union Univ Gdansk, Fac Chem, Wita Stwosza 63, PL-80308 Gdansk, Poland
[3] Korea Inst Adv Study, Sch Computat Sci, Seoul 02455, South Korea
[4] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Fac Elect Telecommun & Informat, Narutowicza 11-12, PL-80233 Gdansk, Poland
Keywords
Multi-GPU scalability; UNRES; Coarse graining; Protein dynamics; High performance computing; MOLECULAR-DYNAMICS SIMULATIONS; UNITED-RESIDUE MODEL; FORCE-FIELD; POLYPEPTIDE-CHAINS; FOLDING PATHWAYS; ALL-ATOM; TESTS; AGGREGATION; ALPHA; TEMPERATURE;
DOI
10.1016/j.cpc.2024.109112
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Graphics Processing Units (GPUs) are nowadays widely used in all-atom molecular simulations because atom pairs can be partitioned efficiently between the kernels that compute the contributions to energy and forces, thus enabling the treatment of very large systems. Extension of the time- and size-scale of computations is also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended interaction sites. Implementation of CG codes on GPUs, particularly on multiple-GPU platforms, is, however, a challenge due to more complicated potentials and the removal of the explicit solvent, which force developers to do interaction- rather than space-domain decomposition. In this paper, we propose a design of a multi-GPU coarse-grained simulator and report the implementation of the heavily coarse-grained physics-based UNited RESidue (UNRES) model of polypeptide chains. By moving all computations to GPUs and keeping the communication with CPUs to a minimum, we managed to achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues. This result makes UNRES the most scalable coarse-grained software and enables us to do laboratory-time millisecond-scale simulations of such cell components as tubulin within days of wall-clock time.
Program summary
Program Title: Multi-GPU UNRES
CPC Library link to program files: https://doi.org/10.17632/hz9s4nwncf.1
Developer's repository link: https://projects.task.gda.pl/eurohpcpl-public/unres
Licensing provisions: GPLv3
Programming language: Fortran + C++/CUDA
Nature of problem: Physics-based simulations of protein systems at biologically relevant time- and size-scales are demanding and consequently require both the simplification of biomolecule representation and substantial computational resources. UNRES (from UNited RESidue) is a physics-based reduced model of polypeptide chains with which to run large-scale coarse-grained simulations of protein structure and dynamics. It enables researchers to study protein folding, protein dynamics, and protein-protein interactions in a physically realistic manner and further unveil the mechanisms of biological processes. Examples of biological applications include studies of amyloid formation, signaling mechanisms, and the action of molecular chaperones.
Solution method: The presented Multi-GPU UNRES relies on a highly optimized GPU implementation of noncentral forces using modern CUDA constructs. This is made possible by the proposed efficient partitioning and assignment of the interaction domain onto GPU resources. We moved as many computations as possible to the device (GPU) side. In most cases, computations are defined and scheduled as CUDA graphs. In selected cases, scheduling kernels manually yields slightly better performance. To maximize parallelism, multiple CUDA streams are used. Furthermore, the code visibly benefits from a tree-based, shared-memory-based allreduce algorithm. Additionally, if the hardware supports it, peer memory access is enabled between all GPUs and the allreduce algorithm takes advantage of it. This feature has made the UNRES coarse-grained protein model with implicit solvent scalable on multiple GPUs, so that we could achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues.
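The tree-based allreduce named in the solution method can be illustrated with a minimal host-only sketch. This is not the UNRES implementation: the name `tree_allreduce_sum` and the `Buffer` type are hypothetical, and each inner vector merely stands in for one GPU's partial force or energy buffer. In the actual code the copies would presumably go through shared memory or peer memory access rather than host vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one device's partial result (e.g. partial forces).
using Buffer = std::vector<double>;

// Tree-based allreduce (sum), simulated on the host.
// Reduce phase: at stride 1, 2, 4, ... rank `dst` accumulates rank `dst + stride`,
// so rank 0 holds the full sum after about log2(n) rounds.
// Broadcast phase: the same tree is walked back so every rank gets the sum.
void tree_allreduce_sum(std::vector<Buffer>& ranks) {
    const std::size_t n = ranks.size();
    if (n == 0) return;
    const std::size_t len = ranks[0].size();
    for (const Buffer& b : ranks) assert(b.size() == len);

    // Reduce phase: pairwise accumulation up the tree.
    std::size_t stride = 1;
    for (; stride < n; stride *= 2) {
        for (std::size_t dst = 0; dst + stride < n; dst += 2 * stride) {
            const Buffer& src = ranks[dst + stride];
            for (std::size_t i = 0; i < len; ++i) ranks[dst][i] += src[i];
        }
    }

    // Broadcast phase: copy the total back down the tree.
    for (stride /= 2; stride >= 1; stride /= 2) {
        for (std::size_t dst = 0; dst + stride < n; dst += 2 * stride) {
            ranks[dst + stride] = ranks[dst];
        }
    }
}
```

With 8 buffers, both phases take three rounds each, and the pairs within a round are independent of one another, which is what makes the tree shape attractive when the participants are separate GPUs.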
Pages: 15
Related papers
98 in total
  • [1] Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
    Abraham, Mark James
    Murtola, Teemu
    Schulz, Roland
    Páll, Szilárd
    Smith, Jeremy C.
    Hess, Berk
    Lindahl, Erik
    [J]. SoftwareX, 2015, 1-2 : 19 - 25
  • [2] Modeling protein structures with the coarse-grained UNRES force field in the CASP14 experiment
    Antoniak, Anna
    Biskupek, Iga
    Bojarski, Krzysztof K.
    Czaplewski, Cezary
    Gieldon, Artur
    Kogut, Mateusz
    Kogut, Malgorzata M.
    Krupa, Pawel
    Lipska, Agnieszka G.
    Liwo, Adam
    Lubecka, Emilia A.
    Marcisz, Mateusz
    Maszota-Zieleniak, Martyna
    Samsonov, Sergey A.
    Sieradzan, Adam K.
    Slusarz, Magdalena J.
    Slusarz, Rafal
    Wesolowski, Patryk A.
    Zieba, Karolina
    [J]. JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2021, 108
  • [3] Prediction of Aggregation of Biologically-Active Peptides with the UNRES Coarse-Grained Model
    Biskupek, Iga
    Czaplewski, Cezary
    Sawicka, Justyna
    Ilowska, Emilia
    Dzierzynska, Maria
    Rodziewicz-Motowidlo, Sylwia
    Liwo, Adam
    [J]. BIOMOLECULES, 2022, 12 (08)
  • [4] Pragmatic Coarse-Graining of Proteins: Models and Applications
    Borges-Araujo, Luis
    Patmanidis, Ilias
    Singh, Akhil P.
    Santos, Lucianna H. S.
    Sieradzan, Adam K.
    Vanni, Stefano
    Czaplewski, Cezary
    Pantano, Sergio
    Shinoda, Wataru
    Monticelli, Luca
    Liwo, Adam
    Marrink, Siewert J.
    Souza, Paulo C. T.
    [J]. JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2023, 19 (20) : 7112 - 7135
  • [5] Dynamic formation and breaking of disulfide bonds in molecular dynamics simulations with the UNRES force field
    Chinchio, M.
    Czaplewski, C.
    Liwo, A.
    Oldziej, S.
    Scheraga, H. A.
    [J]. JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2007, 3 (04) : 1236 - 1248
  • [6] Application of Multiplexed Replica Exchange Molecular Dynamics to the UNRES Force Field: Tests with α and α + β Proteins
    Czaplewski, Cezary
    Kalinowski, Sebastian
    Liwo, Adam
    Scheraga, Harold A.
    [J]. JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2009, 5 (03) : 627 - 640
  • [7] Parallelization of large vector similarity computations in a hybrid CPU+GPU environment
    Czarnul, Pawel
    [J]. JOURNAL OF SUPERCOMPUTING, 2018, 74 (02) : 768 - 786
  • [8] INVESTIGATION OF PARALLEL DATA PROCESSING USING HYBRID HIGH PERFORMANCE CPU + GPU SYSTEMS AND CUDA STREAMS
    Czarnul, Pawel
    [J]. COMPUTING AND INFORMATICS, 2020, 39 (03) : 510 - 536
  • [9] Czarnul Pawel, 2018, Parallel Programming for Modern High Performance Computing Systems
  • [10] SIRAH: A Structurally Unbiased Coarse-Grained Force Field for Proteins with Aqueous Solvation and Long-Range Electrostatics
    Darre, Leonardo
    Rodrigo Machado, Matias
    Febe Brandner, Astrid
    Carlos Gonzalez, Humberto
    Ferreira, Sebastian
    Pantano, Sergio
    [J]. JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2015, 11 (02) : 723 - 739