mdCATH: A Large-Scale MD Dataset for Data-Driven Computational Biophysics

Cited by: 1
Authors
Mirarchi, Antonio [1]
Giorgino, Toni [2]
De Fabritiis, Gianni [1, 3, 4]
Affiliations
[1] Univ Pompeu Fabra, Computat Sci Lab, Barcelona Biomed Res Pk PRBB, Carrer Dr Aiguader 88, Barcelona 08003, Spain
[2] Natl Res Council CNR, Biophys Inst IBF, Via Celoria 26, I-20133 Milan, Italy
[3] Institucio Catalana Recerca i Estudis Avancats ICR, Passeig Lluis Co 23, Barcelona 08010, Spain
[4] Acellera Labs, Doctor Trueta 183, Barcelona 08005, Spain
Funding
US National Institutes of Health (NIH)
Keywords
MOLECULAR-DYNAMICS SIMULATIONS; CATH;
DOI
10.1038/s41597-024-04140-z
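Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09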
Abstract
Recent advancements in protein structure determination are revolutionizing our understanding of proteins. Still, a significant gap remains in the availability of comprehensive datasets that focus on the dynamics of proteins, which are crucial for understanding protein function, folding, and interactions. To address this critical gap, we introduce mdCATH, a dataset generated through an extensive set of all-atom molecular dynamics simulations of a diverse and representative collection of protein domains. This dataset comprises all-atom systems for 5,398 domains, modeled with a state-of-the-art classical force field, and simulated in five replicates each at five temperatures from 320 K to 450 K. The mdCATH dataset records coordinates and forces every 1 ns, for over 62 ms of accumulated simulation time, effectively capturing the dynamics of the various classes of domains and providing a unique resource for proteome-wide statistical analyses of protein unfolding thermodynamics and kinetics. We outline the dataset structure and showcase its potential through four easily reproducible case studies, highlighting its capabilities in advancing protein science.
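As a rough sanity check on the scale described in the abstract, the sketch below works out how many trajectories the stated design implies and what average trajectory length the quoted total corresponds to. It uses only the figures given above. The function and variable names are hypothetical, and the calculation assumes that every domain, temperature, and replicate combination is present and that "over 62 ms" can be treated as a lower bound.

```python
# Back-of-the-envelope check using only the figures quoted in the abstract.
# All names here are illustrative; none come from the mdCATH tooling.

N_DOMAINS = 5398          # all-atom systems, one per domain
N_TEMPERATURES = 5        # five temperatures between 320 K and 450 K
N_REPLICAS = 5            # five replicates per temperature
TOTAL_TIME_MS = 62.0      # "over 62 ms", so treated as a lower bound
FRAME_INTERVAL_NS = 1.0   # coordinates and forces recorded every 1 ns

NS_PER_MS = 1_000_000     # 1 ms = 1e6 ns


def trajectory_count(domains: int, temperatures: int, replicas: int) -> int:
    """Number of trajectories, assuming every combination was simulated."""
    return domains * temperatures * replicas


def mean_length_ns(total_ms: float, n_trajectories: int) -> float:
    """Average trajectory length in nanoseconds implied by the total time."""
    return total_ms * NS_PER_MS / n_trajectories


n_traj = trajectory_count(N_DOMAINS, N_TEMPERATURES, N_REPLICAS)
mean_ns = mean_length_ns(TOTAL_TIME_MS, n_traj)
frames_per_traj = mean_ns / FRAME_INTERVAL_NS

print(f"trajectories: {n_traj}")
print(f"mean length (lower bound): {mean_ns:.1f} ns")
print(f"mean frames per trajectory (lower bound): {frames_per_traj:.0f}")
```

Under these assumptions the design implies 134,950 trajectories, each averaging a little under half a microsecond. Individual trajectory lengths are not stated in the abstract and may differ from this average.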
Pages: 11