QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

Cited by: 0
Authors
van den Maagdenberg, Helle W. [1 ]
Sicho, Martin [1 ,2 ]
Araripe, David Alencar [1 ,3 ]
Luukkonen, Sohvi [1 ,4 ,5 ]
Schoenmaker, Linde [1 ]
Jespers, Michiel [1 ]
Bequignon, Olivier J. M. [1 ,6 ]
Gonzalez, Marina Gorostiola [1 ,7 ]
van den Broek, Remco L. [1 ]
Bernatavicius, Andrius [1 ,8 ]
van Hasselt, J. G. Coen [1 ]
van der Graaf, Piet. H. [1 ,9 ]
van Westen, Gerard J. P. [1 ]
Affiliations
[1] Leiden Univ, Leiden Acad Ctr Drug Res, Computat Drug Discovery, Einsteinweg 55, NL-2333 CC Leiden, Netherlands
[2] Univ Chem & Technol Prague, Fac Chem Technol, Dept Informat & Chem, CZ OPENSCREEN Natl Infrastruct Chem Biol, Tech 5, A-4040 Prague, Czech Republic
[3] Leiden Univ, Dept Human Genet, Med Ctr, Einthovenweg 20, NL-2333ZC Leiden, Netherlands
[4] Johannes Kepler Univ Linz, Inst Machine Learning, ELLIS Unit Linz, Altenberger Str 69, A-610101 Linz, Austria
[5] Johannes Kepler Univ Linz, Inst Machine Learning, LIT Lab, Altenberger Str 69, A-610101 Linz, Austria
[6] Univ Amsterdam, Brain Tumor Ctr Amsterdam, Canc Ctr Amsterdam, Dept Neurosurg,Med Ctr, De Boelelaan 1117, NL-1081 HV Amsterdam, Netherlands
[7] Oncode Inst, Utrecht, Netherlands
[8] Leiden Univ, Leiden Inst Adv Comp Sci, Niels Bohrweg 1, NL-2333 CA Leiden, Netherlands
[9] Canterbury Innovat Ctr, Unit 43, Certara UK, Univ Rd, Canterbury CT2 7FG, England
Source
JOURNAL OF CHEMINFORMATICS | 2024 / Vol. 16 / Issue 01
Keywords
QSPR modelling; QSAR modelling; Proteochemometrics; Cheminformatics; Machine learning; Software; PREDICTION; PROTEOCHEMOMETRICS; REPRODUCIBILITY; DESCRIPTORS; QSAR;
DOI
10.1186/s13321-024-00908-y
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Building reliable and robust quantitative structure-property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously growing and evaluating different algorithms and methodologies can be arduous. Finally, the last hurdle that researchers face is to ensure the reproducibility of their models and facilitate their transferability into practice. In this work, we introduce QSPRpred, a toolkit for analysis of bioactivity data sets and QSPR modelling, which attempts to address the aforementioned challenges. QSPRpred's modular Python API enables users to intuitively describe different parts of a modelling workflow using a plethora of pre-implemented components, but also integrates customized implementations in a "plug-and-play" manner. QSPRpred data sets and models are directly serializable, which means they can be readily reproduced and put into operation after training as the models are saved with all required data pre-processing steps to make predictions on new compounds directly from SMILES strings. The general-purpose character of QSPRpred is also demonstrated by inclusion of support for multi-task and proteochemometric modelling. The package is extensively documented and comes with a large collection of tutorials to help new users. In this paper, we describe all of QSPRpred's functionalities and also conduct a small benchmarking case study to illustrate how different components can be leveraged to compare a diverse set of models. QSPRpred is fully open-source and available at https://github.com/CDDLeiden/QSPRpred.
Scientific Contribution
QSPRpred aims to provide a complex, but comprehensive Python API to conduct all tasks encountered in QSPR modelling from data preparation and analysis to model creation and model deployment. In contrast to similar packages, QSPRpred offers a wider and more exhaustive range of capabilities and integrations with many popular packages that also go beyond QSPR modelling. A significant contribution of QSPRpred is also in its automated and highly standardized serialization scheme, which significantly improves reproducibility and transferability of models.
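The abstract's point about saving a model together with its pre-processing steps can be illustrated with a minimal sketch. It does not use the QSPRpred API: the class `ToyPropertyModel`, its symbol-count "descriptor", and the weights are all hypothetical, chosen only to show why a model that stores its featurization settings can predict directly from SMILES strings after being reloaded.

```python
import json


class ToyPropertyModel:
    """Hypothetical linear model bundled with its own featurization settings."""

    def __init__(self, symbols, weights, bias=0.0):
        if len(symbols) != len(weights):
            raise ValueError("one weight per symbol is required")
        self.symbols = list(symbols)
        self.weights = list(weights)
        self.bias = float(bias)

    def featurize(self, smiles):
        # Crude stand-in for real descriptors: count each symbol in the string.
        return [smiles.count(s) for s in self.symbols]

    def predict(self, smiles_list):
        # Featurization happens inside predict, so callers pass raw SMILES.
        return [
            self.bias
            + sum(w * x for w, x in zip(self.weights, self.featurize(smi)))
            for smi in smiles_list
        ]

    def to_json(self):
        # Pre-processing settings and model parameters are stored side by side.
        return json.dumps(
            {"symbols": self.symbols, "weights": self.weights, "bias": self.bias}
        )

    @classmethod
    def from_json(cls, payload):
        state = json.loads(payload)
        return cls(state["symbols"], state["weights"], state["bias"])


# Assumed toy parameters; a real model would be fitted to data.
model = ToyPropertyModel(symbols=["C", "N", "O"], weights=[0.5, -1.0, -0.5], bias=1.0)
restored = ToyPropertyModel.from_json(model.to_json())
print(restored.predict(["CCO", "CCN"]))  # [1.5, 1.0]
```

Because the symbol list travels with the weights, the restored object needs no external featurization code or configuration, which is the property the abstract attributes to serialized QSPRpred models.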
Pages: 16