Evaluation of Machine Learning Models for Aqueous Solubility Prediction in Drug Discovery

Times cited: 0
Authors
Xue, Nian [1]
Zhang, Yuzhu [2]
Liu, Sensen [3]
Affiliations
[1] NYU, Dept Comp Sc & Engn, New York, NY USA
[2] Carnegie Mellon Univ, Sch Comp Sc, Pittsburgh, PA 15213 USA
[3] Washington Univ, Dept Elect & Syst Engn, St Louis, MO 63110 USA
Source
2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024 | 2024
Keywords
Machine Learning; Solubility Prediction; Drug Discovery; Feature Importance; DESCRIPTORS; QSAR
DOI
10.1109/ICAIBD62003.2024.10604556
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Determining the aqueous solubility of chemical compounds is of great importance in in-silico drug discovery. However, predicting aqueous solubility correctly and rapidly remains a challenging task. This paper explores and evaluates how well multiple machine learning models predict the aqueous solubility of compounds. Specifically, we apply a series of machine learning algorithms, including Random Forest, XGBoost, LightGBM, and CatBoost, to a well-established aqueous solubility dataset (i.e., the Huuskonen dataset) of over 1200 compounds. Experimental results show that even traditional machine learning algorithms can achieve satisfactory performance with high accuracy. In addition, our investigation goes beyond prediction accuracy and examines the interpretability of the models to identify key features and understand the molecular properties that influence the predicted outcomes. This study sheds light on the use of machine learning approaches to predict compound solubility, which could significantly shorten the time that researchers spend on new drug discovery.
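The abstract does not state which interpretability technique the authors used. As one illustration of how key features can be identified for any fitted regressor, the following is a minimal sketch of permutation importance: shuffle one descriptor column at a time and measure how much the prediction error grows. The descriptor names (logP, MolWt, noise), the toy_model function with its coefficients, and the synthetic data are all hypothetical placeholders, not values from the paper or from the Huuskonen dataset.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other descriptor intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic placeholder descriptors (NOT real data): logP, molecular weight,
# and a pure-noise column that the model ignores.
data_rng = random.Random(42)
X = [[data_rng.uniform(-2, 6), data_rng.uniform(100, 500), data_rng.uniform(0, 1)]
     for _ in range(200)]


def toy_model(row):
    """Hypothetical stand-in for a trained regressor predicting log-solubility."""
    logp, mol_wt, _noise = row
    return 0.5 - 1.0 * logp - 0.005 * mol_wt


# Placeholder targets: model output plus Gaussian noise.
y = [toy_model(row) + data_rng.gauss(0, 0.3) for row in X]

feature_names = ["logP", "MolWt", "noise"]
scores = permutation_importance(toy_model, X, y)
for name, score in sorted(zip(feature_names, scores), key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

In practice, predict would wrap a trained Random Forest or gradient-boosting model and X would hold held-out descriptor rows. A feature the model never uses, such as the noise column here, gets an importance of zero because shuffling it leaves the predictions unchanged.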
Pages: 26-33
Number of pages: 8
Related papers
50 in total
  • [21] Prediction of diffusion coefficients in aqueous systems by machine learning models
    Aniceto, Jose P. S.
    Zezere, Bruno
    Silva, Carlos M.
    JOURNAL OF MOLECULAR LIQUIDS, 2024, 405
  • [22] Novel Big Data-Driven Machine Learning Models for Drug Discovery Application
    Sripriya Akondi, Vishnu
    Menon, Vineetha
    Baudry, Jerome
    Whittle, Jana
    MOLECULES, 2022, 27 (03):
  • [23] Survey of Machine Learning Techniques in Drug Discovery
    Stephenson, Natalie
    Shane, Emily
    Chase, Jessica
    Rowland, Jason
    Ries, David
    Justice, Nicola
    Zhang, Jie
    Chan, Leong
    Cao, Renzhi
    CURRENT DRUG METABOLISM, 2019, 20 (03) : 185 - 193
  • [24] Machine learning for target discovery in drug development
    Rodrigues, Tiago
    Bernardes, Goncalo J. L.
    CURRENT OPINION IN CHEMICAL BIOLOGY, 2020, 56 : 16 - 22
  • [25] Evaluation of Machine Learning Models for Clinical Prediction Problems
    Sanchez-Pinto, L. Nelson
    Bennett, Tellen D.
    PEDIATRIC CRITICAL CARE MEDICINE, 2022, 23 (05) : 405 - 408
  • [26] Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery
    Lane, Thomas
    Russo, Daniel P.
    Zorn, Kimberley M.
    Clark, Alex M.
    Korotcov, Alexandru
    Tkachenko, Valery
    Reynolds, Robert C.
    Perryman, Alexander L.
    Freundlich, Joel S.
    Ekins, Sean
    MOLECULAR PHARMACEUTICS, 2018, 15 (10) : 4346 - 4360
  • [27] Recent development of machine learning models for the prediction of drug-drug interactions
    Hong, Eujin
    Jeon, Junhyeok
    Kim, Hyun Uk
    KOREAN JOURNAL OF CHEMICAL ENGINEERING, 2023, 40 (02) : 276 - 285
  • [28] Recent development of machine learning models for the prediction of drug-drug interactions
    Eujin Hong
    Junhyeok Jeon
    Hyun Uk Kim
    Korean Journal of Chemical Engineering, 2023, 40 : 276 - 285
  • [29] Multi-channel GCN ensembled machine learning model for molecular aqueous solubility prediction on a clean dataset
    Deng, Chenglong
    Liang, Li
    Xing, Guomeng
    Hua, Yi
    Lu, Tao
    Zhang, Yanmin
    Chen, Yadong
    Liu, Haichun
    MOLECULAR DIVERSITY, 2023, 27 (03) : 1023 - 1035
  • [30] Multi-channel GCN ensembled machine learning model for molecular aqueous solubility prediction on a clean dataset
    Chenglong Deng
    Li Liang
    Guomeng Xing
    Yi Hua
    Tao Lu
    Yanmin Zhang
    Yadong Chen
    Haichun Liu
    Molecular Diversity, 2023, 27 : 1023 - 1035