Cross-validation strategies in QSPR modelling of chemical reactions

Cited by: 14
Authors
Rakhimbekova, A. [1]
Akhmetshin, T. N. [1, 2]
Minibaeva, G. I. [1]
Nugmanov, R. I. [1]
Gimadiev, T. R. [3]
Madzhidov, T. I. [1]
Baskin, I. I. [4]
Varnek, A. [2]
Affiliations
[1] Kazan Fed Univ, AM Butlerov Inst Chem, Kazan, Russia
[2] Univ Strasbourg, UMR 7140 CNRS, Lab Chemoinformat, Strasbourg, France
[3] Hokkaido Univ, Inst Chem React Design & Discovery, Sapporo, Hokkaido, Japan
[4] Technion Israel Inst Technol, Dept Mat Sci & Engn, Haifa, Israel
Funding
Russian Science Foundation
Keywords
Validation; QSPR; chemical reactions; rate constant prediction; reaction rate; structure-reactivity modelling;
DOI
10.1080/1062936X.2021.1883107
Chinese Library Classification
O6 [Chemistry]
Subject classification code
0703
Abstract
In this article, we consider cross-validation of quantitative structure-property relationship (QSPR) models for reactions and show that the conventional k-fold cross-validation (CV) procedure gives an 'optimistically' biased assessment of prediction performance. To address this issue, we suggest two strategies of model cross-validation: 'transformation-out' CV and 'solvent-out' CV. Unlike the conventional k-fold cross-validation approach, which does not consider the nature of the objects, the proposed procedures provide an unbiased estimate of the predictive performance of the models for novel types of structural transformations in chemical reactions and for reactions proceeding under new conditions. Both suggested strategies have been applied to predict the rate constants of bimolecular elimination and nucleophilic substitution reactions, and of Diels-Alder cycloaddition. All the suggested cross-validation methodologies, together with a tutorial, are implemented in the open-source software package CIMtools (https://github.com/cimm-kzn/CIMtools).
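The idea behind the two strategies can be illustrated with a generic grouped split. The sketch below is not the CIMtools implementation and does not use its API. It assumes that each reaction record carries a transformation label and a solvent label, and that "transformation-out" and "solvent-out" mean keeping all records that share such a label in the same fold. The function name `group_out_folds`, the greedy balancing rule, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def group_out_folds(groups, n_folds):
    """Yield (train_idx, test_idx) pairs in which no group label
    appears on both the training and the test side of a fold."""
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    if n_folds > len(members):
        raise ValueError("n_folds cannot exceed the number of distinct groups")

    # Greedy balancing (an assumption of this sketch): place the largest
    # groups first, each into the fold that currently holds the fewest records.
    folds = [[] for _ in range(n_folds)]
    for label in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(members[label])

    for k in range(n_folds):
        test_idx = sorted(folds[k])
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield train_idx, test_idx


# Hypothetical toy data: one entry per reaction record.
transformations = ["T1", "T1", "T2", "T2", "T3", "T3", "T1", "T2"]
solvents = ["water", "ethanol", "water", "DMSO",
            "ethanol", "DMSO", "DMSO", "water"]

# Transformation-out: every record of a given transformation is held out together.
transformation_out = list(group_out_folds(transformations, n_folds=3))
# Solvent-out: every record measured in a given solvent is held out together.
solvent_out = list(group_out_folds(solvents, n_folds=3))

for train_idx, test_idx in solvent_out:
    print("held-out solvents:", sorted({solvents[i] for i in test_idx}))
```

With a conventional k-fold split, records of the same transformation measured in different solvents can land on both sides of a fold, which is the kind of overlap the abstract describes as a source of optimistic bias. Grouping by label removes that overlap by construction.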
Pages: 207-219
Number of pages: 13