QMugs, quantum mechanical properties of drug-like molecules

Cited by: 82
Authors
Isert, Clemens [1]
Atz, Kenneth [1]
Jimenez-Luna, Jose [1, 2]
Schneider, Gisbert [1, 3]
Affiliations
[1] Swiss Fed Inst Technol, Dept Chem & Appl Biosci, RETHINK, CH-8093 Zurich, Switzerland
[2] Boehringer Ingelheim Pharma GmbH & Co KG, Dept Med Chem, Birkendorfer Str 65, D-88397 Biberach, Germany
[3] ETH Singapore SEC Ltd, 1 CREATE Way,06-01 CREATE Tower, Singapore 138602, Singapore
Funding
Swiss National Science Foundation;
Keywords
ISOMERIZATION; DISPERSION; DESIGN;
DOI
10.1038/s41597-022-01390-7
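Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code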
07 ; 0710 ; 09 ;
Abstract
Machine learning approaches in drug discovery, as well as in other areas of the chemical sciences, benefit from curated datasets of physical molecular properties. However, there currently is a lack of data collections featuring large bioactive molecules alongside first-principle quantum chemical information. The open-access QMugs (Quantum-Mechanical Properties of Drug-like Molecules) dataset fills this void. The QMugs collection comprises quantum mechanical properties of more than 665 k biologically and pharmacologically relevant molecules extracted from the ChEMBL database, totaling ~2 M conformers. QMugs contains optimized molecular geometries and thermodynamic data obtained via the semi-empirical method GFN2-xTB. Atomic and molecular properties are provided on both the GFN2-xTB and on the density-functional levels of theory (DFT, ωB97X-D/def2-SVP). QMugs features molecules of significantly larger size than previously-reported collections and comprises their respective quantum mechanical wave functions, including DFT density and orbital matrices. This dataset is intended to facilitate the development of models that learn from molecular data on different levels of theory while also providing insight into the corresponding relationships between molecular structure and biological activity.
Pages: 11
Related papers
57 entries in total
[1]   [Anonymous], 2020, Computational Chemistry Comparison and Benchmark Database: Precomputed Vibrational Scaling Factors
[2]   Ring Strain Energy in the Cyclooctyl System. The Effect of Strain Energy on [3+2] Cycloaddition Reactions with Azides [J].
Bach, Robert D. .
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 2009, 131 (14) :5233-5243
[3]   tmQM Dataset-Quantum Geometries and Properties of 86k Transition Metal Complexes [J].
Balcells, David ;
Skjelstad, Bastian Bjerkem .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2020, 60 (12) :6135-6146
[4]   Extended tight-binding quantum chemistry methods [J].
Bannwarth, Christoph ;
Caldeweyher, Eike ;
Ehlert, Sebastian ;
Hansen, Andreas ;
Pracht, Philipp ;
Seibert, Jakob ;
Spicher, Sebastian ;
Grimme, Stefan .
WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE, 2021, 11 (02)
[5]   GFN2-xTB-An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions [J].
Bannwarth, Christoph ;
Ehlert, Sebastian ;
Grimme, Stefan .
JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2019, 15 (03) :1652-1671
[6]   An open source chemical structure curation pipeline using RDKit [J].
Bento, A. Patricia ;
Hersey, Anne ;
Felix, Eloy ;
Landrum, Greg ;
Gaulton, Anna ;
Atkinson, Francis ;
Bellis, Louisa J. ;
De Veij, Marleen ;
Leach, Andrew R. .
JOURNAL OF CHEMINFORMATICS, 2020, 12 (01)
[7]   KNIME: The Konstanz Information Miner [J].
Berthold, Michael R. ;
Cebron, Nicolas ;
Dill, Fabian ;
Gabriel, Thomas R. ;
Koetter, Tobias ;
Meinl, Thorsten ;
Ohl, Peter ;
Sieb, Christoph ;
Thiel, Kilian ;
Wiswedel, Bernd .
DATA ANALYSIS, MACHINE LEARNING AND APPLICATIONS, 2008, :319-326
[8]   PubChem3D: Conformer generation [J].
Bolton, Evan E. ;
Kim, Sunghwan ;
Bryant, Stephen H. .
JOURNAL OF CHEMINFORMATICS, 2011, 3
[9]   Long-range corrected hybrid density functionals with damped atom-atom dispersion corrections [J].
Chai, Jeng-Da ;
Head-Gordon, Martin .
PHYSICAL CHEMISTRY CHEMICAL PHYSICS, 2008, 10 (44) :6615-6620
[10]   OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy [J].
Christensen, Anders S. ;
Sirumalla, Sai Krishna ;
Qiao, Zhuoran ;
O'Connor, Michael B. ;
Smith, Daniel G. A. ;
Ding, Feizhi ;
Bygrave, Peter J. ;
Anandkumar, Animashree ;
Welborn, Matthew ;
Manby, Frederick R. ;
Miller, Thomas F., III .
JOURNAL OF CHEMICAL PHYSICS, 2021, 155 (20)