Development of Ensemble Learning Method Considering Applicability Domains

Cited by: 0
Authors
Sato, Keigo [1]
Kaneko, Hiromasa [1]
Affiliations
[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Tokyo, Japan
Keywords
Ensemble learning; Regression; Applicability domain; QSAR; QSPR; Models; Prediction
DOI
10.2477/jccj.2019-0010
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) analyses, regression models are constructed between the activities or properties y and the molecular descriptors x of compounds. In ensemble learning, multiple sub-models are constructed to improve predictive performance, and the final y-value is predicted by integrating the y-values predicted by the sub-models. Although predictive performance has been confirmed to improve when the applicability domain (AD) of each sub-model is considered and only the sub-models inside their ADs are used, ADs cannot be compared between sub-datasets with different x. It was therefore impossible to predict a y-value for a new sample by selecting and weighting sub-models. In this study, we focused on the similarity-weighted root-mean-square distance (wRMSD), which is an index of AD, and developed wRMSD-based AD considering ensemble learning (WEL), an ensemble learning method based on wRMSD. Because wRMSD is expressed on the scale of y, ADs can be compared between sub-models with different x; thus, a y-value can be predicted for a new sample by giving greater weight to sub-models with low wRMSD values, which indicate high prediction reliability. Data analysis using three datasets of compounds for which water solubility, toxicity, and pharmacological activity were measured confirmed that WEL enlarged the AD and improved predictive performance compared with the conventional ensemble learning method. Python code for WEL is available at https://github.com/hkaneko1985/wel.
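The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce the API of the linked repository. The abstract gives no formula, so the sketch assumes that wRMSD is the square root of the similarity-weighted mean of squared training residuals, that similarity is a Gaussian kernel on Euclidean distance, and that sub-model weights are proportional to 1/wRMSD. The function names, the `gamma` parameter, the linear sub-models, and the toy data are all hypothetical.

```python
import numpy as np

def wrmsd(x_new, x_train, residuals, gamma=1.0):
    """Assumed form of wRMSD: similarity-weighted RMS of training residuals.

    The result is on the scale of y, so values from sub-models that use
    different descriptor subsets can be compared directly.
    """
    sq_dist = np.sum((x_train - x_new) ** 2, axis=1)
    # Shifting by the minimum distance avoids underflow; the constant
    # factor cancels in the ratio below, so the value is unchanged.
    sim = np.exp(-gamma * (sq_dist - sq_dist.min()))
    return float(np.sqrt(np.sum(sim * residuals ** 2) / np.sum(sim)))

def fit_sub_model(X, y, cols):
    """Fit a linear sub-model on a subset of descriptor columns (illustrative)."""
    Xs = X[:, cols]
    A = np.column_stack([Xs, np.ones(len(Xs))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Assumption: plain training residuals are used here for brevity.
    residuals = y - A @ coef
    return {"cols": cols, "coef": coef, "x_train": Xs, "residuals": residuals}

def weighted_ensemble_predict(x_new, sub_models, gamma=1.0):
    """Combine sub-model predictions with weights proportional to 1/wRMSD (assumed)."""
    preds, scores = [], []
    for m in sub_models:
        xs = x_new[m["cols"]]
        preds.append(float(np.append(xs, 1.0) @ m["coef"]))
        scores.append(wrmsd(xs, m["x_train"], m["residuals"], gamma))
    preds, scores = np.array(preds), np.array(scores)
    weights = 1.0 / (scores + 1e-12)  # low wRMSD -> high weight
    weights /= weights.sum()
    return float(weights @ preds), preds, scores, weights

# Toy data: three descriptors, y depends on the first two only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=50)

sub_models = [fit_sub_model(X, y, cols) for cols in ([0, 1], [1, 2], [0, 2])]
x_new = np.array([0.1, -0.2, 0.3])
y_hat, preds, scores, weights = weighted_ensemble_predict(x_new, sub_models)
print("sub-model predictions:", preds)
print("wRMSD per sub-model:  ", scores)
print("weights:              ", weights)
print("ensemble prediction:  ", y_hat)
```

Under these assumptions, the sub-model whose training errors are smallest near the new sample receives the largest weight. A selection rule, such as using only sub-models whose wRMSD falls below a threshold, could replace the inverse weighting.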
Pages: 187-193
Number of pages: 7