Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems

Cited by: 6
Authors
Ocetkiewicz, Krzysztof M. [1]
Czaplewski, Cezary [1, 2, 3]
Krawczyk, Henryk [1, 4]
Lipska, Agnieszka G. [1]
Liwo, Adam [1]
Proficz, Jerzy [1]
Sieradzan, Adam K. [1, 2, 3]
Czarnul, Pawel [4]
Affiliations
[1] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Ctr Informat Tricity Acad Supercomp & Network CI T, Narutowicza 11-12, PL-80233 Gdansk, Poland
[2] Univ Gdansk, Fahrenheit Union Univ Gdansk, Fac Chem, Wita Stwosza 63, PL-80308 Gdansk, Poland
[3] Korea Inst Adv Study, Sch Computat Sci, Seoul 02455, South Korea
[4] Gdansk Univ Technol, Fahrenheit Union Univ Gdansk, Fac Elect Telecommun & Informat, Narutowicza 11-12, PL-80233 Gdansk, Poland
Keywords
Multi-GPU scalability; UNRES; Coarse graining; Protein dynamics; High performance computing; MOLECULAR-DYNAMICS SIMULATIONS; UNITED-RESIDUE MODEL; FORCE-FIELD; POLYPEPTIDE-CHAINS; FOLDING PATHWAYS; ALL-ATOM; TESTS; AGGREGATION; ALPHA; TEMPERATURE;
DOI
10.1016/j.cpc.2024.109112
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Graphical Processor Units (GPUs) are nowadays widely used in all-atom molecular simulations because of the advantage of efficient partitioning of atom pairs between the kernels to compute the contributions to energy and forces, thus enabling the treatment of very large systems. Extension of time- and size-scale of computations is also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended interaction sites. Implementation of CG codes on the GPUs, particularly the multiple-GPU platforms is, however, a challenge due to more complicated potentials and removing the explicit solvent, forcing developers to do interaction- rather than space-domain decomposition. In this paper, we propose a design of a multi-GPU coarse-grained simulator and report the implementation of the heavily coarse-grained physics-based UNited RESidue (UNRES) model of polypeptide chains. By moving all computations to GPUs and keeping the communication with CPUs to a minimum, we managed to achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues, this result making UNRES the best scalable coarse-grained software and enabling us to do laboratory-time millisecond-scale simulations of such cell components as tubulin within days of wall-clock time.

Program summary
Program Title: Multi-GPU UNRES
CPC Library link to program files: https://doi.org/10.17632/hz9s4nwncf.1
Developer's repository link: https://projects.task.gda.pl/eurohpcpl-public/unres
Licensing provisions: GPLv3
Programming language: Fortran + C++/CUDA
Nature of problem: Physics-based simulations of protein systems at biologically relevant time- and size-scale are demanding and consequently require both the simplification of biomolecule representation and substantial computational resources. UNRES (from UNited RESidue) is a physics-based reduced model of polypeptide chains with which to run large-scale coarse-grained simulations of protein structure and dynamics. It enables the researchers to study protein folding, protein dynamics, and protein-protein interactions in a physically realistic manner and further unveil biological processes' mechanisms. Examples of biological applications include studies of amyloid formations, signaling mechanism, and action of molecular chaperones.
Solution method: The presented Multi-GPU UNRES relies on a highly optimized GPU implementation of noncentral forces using modern CUDA constructs. Fundamentally, it is possible by proposed efficient partitioning and assignment of the interaction domain onto GPU resources. We moved as many computations as possible to the device (GPU) side. In most cases, computations are defined and scheduled as CUDA graphs. In selected cases, scheduling kernels manually yields slightly better performance. To maximize parallelism, multiple CUDA streams are used. Furthermore, the code visibly benefits from a tree-based allreduce shared-memory-based algorithm. Additionally, if present within hardware, peer memory access is enabled between all GPUs and the allreduce algorithm takes advantage of it. This feature has made the UNRES coarse-grained protein model with implicit solvent scalable for multi-GPUs so that we could achieve almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues.
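The solution method names a tree-based allreduce as one ingredient of the multi-GPU design. The sketch below is only a host-side illustration of that general technique in plain C++, not the UNRES code: each entry of `buffers` is assumed to stand in for the partial result (for example, a force array) held by one GPU, and the names `Buffer` and `tree_allreduce_sum` are hypothetical. A real multi-GPU version would replace the inner copies and sums with device kernels and, where the hardware allows it, peer-to-peer memory access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the partial result held by one device.
using Buffer = std::vector<double>;

// Tree-based allreduce (element-wise sum): a reduce phase up the tree,
// followed by a broadcast phase back down. Works for any rank count >= 1,
// not only powers of two. After the call every buffer holds the total.
void tree_allreduce_sum(std::vector<Buffer>& buffers) {
    assert(!buffers.empty());
    const std::size_t n = buffers.size();
    const std::size_t len = buffers[0].size();
    for (const Buffer& b : buffers) {
        assert(b.size() == len);  // all ranks must hold equally sized data
    }

    // Reduce phase: at each level, rank r absorbs the data of rank r + stride.
    std::size_t stride = 1;
    for (; stride < n; stride *= 2) {
        for (std::size_t r = 0; r + stride < n; r += 2 * stride) {
            for (std::size_t i = 0; i < len; ++i) {
                buffers[r][i] += buffers[r + stride][i];
            }
        }
    }

    // Broadcast phase: retrace the tree so every rank receives the total.
    // stride is now the smallest power of two >= n; halve it each level.
    for (stride /= 2; stride >= 1; stride /= 2) {
        for (std::size_t r = 0; r + stride < n; r += 2 * stride) {
            buffers[r + stride] = buffers[r];
        }
    }
}
```

With n participants, each phase takes about log2(n) levels, and the pairs within one level are independent of each other, which is what lets a real implementation run them concurrently on different devices.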
Pages: 15