Cross-validation strategies in QSPR modelling of chemical reactions

Cited by: 14
Authors
Rakhimbekova, A. [1]
Akhmetshin, T. N. [1, 2]
Minibaeva, G. I. [1]
Nugmanov, R. I. [1]
Gimadiev, T. R. [3]
Madzhidov, T. I. [1]
Baskin, I. I. [4]
Varnek, A. [2]
Affiliations
[1] Kazan Fed Univ, AM Butlerov Inst Chem, Kazan, Russia
[2] Univ Strasbourg, UMR 7140 CNRS, Lab Chemoinformat, Strasbourg, France
[3] Hokkaido Univ, Inst Chem React Design & Discovery, Sapporo, Hokkaido, Japan
[4] Technion Israel Inst Technol, Dept Mat Sci & Engn, Haifa, Israel
Funding
Russian Science Foundation
Keywords
Validation; QSPR; chemical reactions; rate constant prediction; reaction rate; structure-reactivity modelling
DOI
10.1080/1062936X.2021.1883107
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In this article, we consider cross-validation of quantitative structure-property relationship models for reactions and show that the conventional k-fold cross-validation (CV) procedure gives an 'optimistically' biased assessment of prediction performance. To address this issue, we suggest two strategies of model cross-validation: 'transformation-out' CV and 'solvent-out' CV. Unlike the conventional k-fold cross-validation approach, which does not consider the nature of objects, the proposed procedures provide an unbiased estimation of the predictive performance of the models for novel types of structural transformations in chemical reactions and for reactions going under new conditions. Both suggested strategies have been applied to predict the rate constants of bimolecular elimination and nucleophilic substitution reactions, and of Diels-Alder cycloaddition. All suggested cross-validation methodologies and a tutorial are implemented in the open-source software package CIMtools (https://github.com/cimm-kzn/CIMtools).
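The general idea behind group-aware cross-validation of this kind can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the CIMtools API; it relies on scikit-learn's `GroupKFold` and assumes that every reaction already carries a transformation label and a solvent label. The toy labels (`T1` to `T4`, the solvent names) and the helper names `grouped_folds` and `leaks` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, KFold

# Toy data set of 12 reactions. All labels are hypothetical placeholders:
# each reaction has one transformation type and one solvent.
transformations = np.array(["T1", "T2", "T3", "T4"] * 3)
solvents = np.repeat(["water", "ethanol", "acetone"], 4)
X = np.arange(12).reshape(-1, 1)  # stand-in for reaction descriptors


def grouped_folds(groups, n_splits):
    """Return (train_idx, test_idx) pairs in which no group is split."""
    return list(GroupKFold(n_splits=n_splits).split(X, groups=groups))


def leaks(folds, groups):
    """True if any fold has a group present in both train and test."""
    return any(set(groups[tr]) & set(groups[te]) for tr, te in folds)


# Group-aware splits: every test fold holds only unseen transformations
# (or unseen solvents), mimicking prediction for novel cases.
transformation_out = grouped_folds(transformations, n_splits=4)
solvent_out = grouped_folds(solvents, n_splits=3)

# Conventional k-fold ignores the labels, so the same transformation or
# solvent appears on both sides of the split.
conventional = list(KFold(n_splits=4).split(X))

print("conventional leaks transformations:", leaks(conventional, transformations))
print("transformation-out leaks:", leaks(transformation_out, transformations))
print("solvent-out leaks:", leaks(solvent_out, solvents))
```

In this sketch the conventional split lets the model see each transformation and solvent during training, which is the kind of overlap the abstract associates with optimistic bias, whereas the grouped splits keep whole groups out of the training set.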
Pages: 207-219
Number of pages: 13