mdCATH: A Large-Scale MD Dataset for Data-Driven Computational Biophysics

Cited by: 1
Authors
Mirarchi, Antonio [1]
Giorgino, Toni [2]
De Fabritiis, Gianni [1, 3, 4]
Affiliations
[1] Univ Pompeu Fabra, Computat Sci Lab, Barcelona Biomed Res Pk PRBB, Carrer Dr Aiguader 88, Barcelona 08003, Spain
[2] Natl Res Council CNR, Biophys Inst IBF, Via Celoria 26, I-20133 Milan, Italy
[3] Institucio Catalana Recerca i Estudis Avancats ICR, Passeig Lluis Co 23, Barcelona 08010, Spain
[4] Acellera Labs, Doctor Trueta 183, Barcelona 08005, Spain
Funding
US National Institutes of Health (NIH)
Keywords
MOLECULAR-DYNAMICS SIMULATIONS; CATH;
DOI
10.1038/s41597-024-04140-z
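Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09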
Abstract
Recent advancements in protein structure determination are revolutionizing our understanding of proteins. Still, a significant gap remains in the availability of comprehensive datasets that focus on the dynamics of proteins, which are crucial for understanding protein function, folding, and interactions. To address this critical gap, we introduce mdCATH, a dataset generated through an extensive set of all-atom molecular dynamics simulations of a diverse and representative collection of protein domains. This dataset comprises all-atom systems for 5,398 domains, modeled with a state-of-the-art classical force field, and simulated in five replicates each at five temperatures from 320 K to 450 K. The mdCATH dataset records coordinates and forces every 1 ns, for over 62 ms of accumulated simulation time, effectively capturing the dynamics of the various classes of domains and providing a unique resource for proteome-wide statistical analyses of protein unfolding thermodynamics and kinetics. We outline the dataset structure and showcase its potential through four easily reproducible case studies, highlighting its capabilities in advancing protein science.
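As a quick sanity check on the scale quoted in the abstract, the figures can be combined into a trajectory count and an average trajectory length. This is a minimal sketch using only the numbers stated above; the variable names are illustrative, and because the abstract says "over 62 ms", the resulting average should be read as a lower bound rather than an exact value.

```python
# Figures taken from the abstract; variable names are illustrative only.
n_domains = 5398        # protein domains simulated
n_temperatures = 5      # temperatures between 320 K and 450 K
n_replicas = 5          # replicates per domain and temperature
total_time_ms = 62      # "over 62 ms", so treated here as a lower bound

# One trajectory per (domain, temperature, replica) combination.
n_trajectories = n_domains * n_temperatures * n_replicas

# Convert milliseconds to nanoseconds (1 ms = 1e6 ns) and average.
mean_length_ns = total_time_ms * 1e6 / n_trajectories

print(f"trajectories: {n_trajectories}")
print(f"mean length:  at least {mean_length_ns:.0f} ns")
```

Under these assumptions the dataset holds on the order of 135,000 trajectories, each a few hundred nanoseconds long on average.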
Pages: 11