Development of Ensemble Learning Method Considering Applicability Domains

Cited by: 0
Authors
Sato, Keigo [1]
Kaneko, Hiromasa [1]
Affiliations
[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Tokyo, Japan
Keywords
Ensemble learning; Regression; Applicability domain; QSAR; QSPR; MODELS; PREDICTION; REGRESSION;
DOI
10.2477/jccj.2019-0010
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) studies, regression models are constructed between the activities and properties y and the molecular descriptors x of compounds. In ensemble learning, multiple sub-models are constructed to improve the predictive performance of models, and a final y-value is predicted by integrating the y-values predicted by the sub-models. Although it has been confirmed that predictive performance improves when the applicability domain (AD) of each sub-model is considered and only the sub-models inside their ADs are used, ADs cannot be compared between sub-datasets with different x. It was therefore impossible to predict a y-value by selecting and weighting sub-models for a new sample. In this study, we focused on the similarity-weighted root-mean-square distance (wRMSD), which is an index of AD, and developed wRMSD-based AD considering ensemble learning (WEL), an ensemble learning method based on wRMSD. Because wRMSD is expressed on the scale of y, ADs can be compared between sub-models with different x; thus, a y-value can be predicted for a new sample by giving greater weight to sub-models with low wRMSD values, which indicate high prediction reliability. Data analysis using three datasets of compounds for which water solubility, toxicity, and pharmacological activity were measured confirmed that WEL enlarged the AD and improved predictive performance compared with the conventional ensemble learning method. Python code for WEL is available at https://github.com/hkaneko1985/wel.
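The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes that each sub-model has already produced a predicted y-value and a wRMSD value for the new sample, and it assumes inverse-wRMSD weighting, which the abstract does not specify. The function name `weighted_ensemble_prediction` and the parameter `eps` are hypothetical.

```python
import numpy as np


def weighted_ensemble_prediction(sub_predictions, sub_wrmsd, eps=1e-12):
    """Combine sub-model predictions for ONE new sample.

    sub_predictions: y-values predicted by each sub-model.
    sub_wrmsd: wRMSD value of each sub-model for this sample
               (lower is taken to mean a more reliable prediction).
    Assumption: weights are proportional to 1 / wRMSD. The abstract
    only says low-wRMSD sub-models receive more weight.
    """
    sub_predictions = np.asarray(sub_predictions, dtype=float)
    sub_wrmsd = np.asarray(sub_wrmsd, dtype=float)
    if sub_predictions.shape != sub_wrmsd.shape:
        raise ValueError("one wRMSD value is required per sub-model")
    if np.any(sub_wrmsd < 0):
        raise ValueError("wRMSD values must be non-negative")

    # eps guards against division by zero when a wRMSD value is 0.
    weights = 1.0 / (sub_wrmsd + eps)
    weights /= weights.sum()  # normalize so the weights sum to 1
    return float(np.dot(weights, sub_predictions))
```

For example, with predictions 1.0 and 3.0 and wRMSD values 1.0 and 3.0, the normalized weights are 0.75 and 0.25, so the combined prediction is approximately 1.5. Because wRMSD is on the scale of y for every sub-model, the values can be compared directly even when the sub-models use different x.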
Pages: 187-193
Number of pages: 7
Related papers
50 in total
  • [1] Applicability Domains and Consistent Structure Generation
    Kaneko, Hiromasa
    Funatsu, Kimito
    MOLECULAR INFORMATICS, 2017, 36 (1-2)
  • [2] Ensemble Machine Learning and Applicability Domain Estimation for Fluorescence Properties and its Application to Structural Design
    Sugawara, Yuki
    Kotera, Masaaki
    Tanaka, Kenichi
    Funatsu, Kimito
    JOURNAL OF COMPUTER AIDED CHEMISTRY, 2019, 20 : 7 - 17
  • [3] An ensemble spatial prediction method considering geospatial heterogeneity
    Cheng, Shifen
    Wang, Lizeng
    Wang, Peixiao
    Lu, Feng
    INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2024, 38 (09) : 1856 - 1880
  • [4] Automatic outlier sample detection based on regression analysis and repeated ensemble learning
    Kaneko, Hiromasa
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2018, 177 : 74 - 82
  • [5] Discussion on Regression Methods Based on Ensemble Learning and Applicability Domains of Linear Submodels
    Kaneko, Hiromasa
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2018, 58 (02) : 480 - 489
  • [6] A new measure of regression model accuracy that considers applicability domains
    Kaneko, Hiromasa
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2017, 171 : 1 - 8
  • [7] Fast and scalable ensemble learning method for versatile polygenic risk prediction
    Chen, Tony
    Zhang, Haoyu
    Mazumder, Rahul
    Lin, Xihong
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2024, 121 (33)
  • [8] Comprehensive Analysis of Applicability Domains of QSPR Models for Chemical Reactions
    Rakhimbekova, Assima
    Madzhidov, Timur I.
    Nugmanov, Ramil I.
    Gimadiev, Timur R.
    Baskin, Igor I.
    Varnek, Alexandre
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2020, 21 (15) : 1 - 20
  • [9] Ensemble learning-based computational imaging method for electrical capacitance tomography
    Lei, J.
    Liu, Q. B.
    Wang, X. Y.
    APPLIED MATHEMATICAL MODELLING, 2020, 82 : 521 - 545
  • [10] Ensemble deep learning: A review
    Ganaie, M. A.
    Hu, Minghui
    Malik, A. K.
    Tanveer, M.
    Suganthan, P. N.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 115