Machine learning methods to predict the crystallization propensity of small organic molecules

Cited by: 11
Author
Pereira, Florbela [1, 2]
Affiliations
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, LAQV, Caparica, Portugal
[2] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, REQUIMTE, Caparica, Portugal
Keywords
CLASSIFICATION; STABILITY; TENDENCY
DOI
10.1039/d0ce00070a
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Machine learning (ML) algorithms were explored for predicting crystallization propensity from molecular descriptors and fingerprints generated from 2D chemical structures, and from 3D molecular descriptors computed on 3D chemical structures optimized with empirical methods. In total, 57 815 molecules were retrieved from the Reaxys® database, of which 53 998 are recorded as crystalline (class A), 3097 as polymorphic (class B), and 720 as amorphous (class C). A training set of 40 462 organic molecules was used to build the models, which were validated with an external test set of 17 353 organic molecules. Several ML algorithms, including random forest (RF), support vector machines (SVM), and deep learning multilayer perceptron networks (MLP), were screened. The best performance was achieved with a consensus classification model built from the RF, SVM, and MLP models, which predicted the external test set with an overall predictive accuracy (Q) of up to 80%.
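The abstract does not say how the RF, SVM, and MLP outputs were combined into a consensus, so the following is only a minimal sketch assuming a simple majority vote over the three class labels, with Q taken as the fraction of correctly classified molecules. The function names, the tie-break rule, and the toy predictions are hypothetical and are not taken from the paper; only the data set counts come from the abstract.

```python
from collections import Counter

# Counts reported in the abstract.
CLASS_COUNTS = {"A": 53_998, "B": 3_097, "C": 720}  # crystalline, polymorphic, amorphous
TRAIN_SIZE, TEST_SIZE = 40_462, 17_353


def consensus(votes, tie_break="A"):
    """Return the label chosen by a strict majority of the model votes.

    tie_break is a hypothetical rule (fall back to the largest class);
    the abstract does not describe how disagreements were resolved.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else tie_break


def overall_accuracy(y_true, y_pred):
    """Overall predictive accuracy Q: share of molecules predicted correctly."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for six molecules, purely for illustration.
y_true = ["A", "A", "B", "C", "A", "B"]
rf_pred = ["A", "A", "B", "A", "A", "A"]
svm_pred = ["A", "B", "B", "C", "A", "B"]
mlp_pred = ["A", "A", "A", "C", "B", "C"]

y_cons = [consensus(v) for v in zip(rf_pred, svm_pred, mlp_pred)]
q = overall_accuracy(y_true, y_cons)

# The reported class counts and the train/test split both add up to 57 815.
total = sum(CLASS_COUNTS.values())
print(y_cons, round(q, 3), total == TRAIN_SIZE + TEST_SIZE)
```

In this toy run the three models disagree completely on the last molecule, so the assumed tie-break decides its label. A weighted or probability-averaged vote would be an equally plausible reading of "consensus".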
Pages: 2817 - 2826
Page count: 10
Related papers
50 in total
  • [41] MLASM: Machine learning based prediction of anticancer small molecules
    Balaji, Priya Dharshini
    Selvam, Subathra
    Sohn, Honglae
    Madhavan, Thirumurthy
    MOLECULAR DIVERSITY, 2024, 28 (04) : 2153 - 2161
  • [42] Nuisance small molecules under a machine-learning lens
    Rodrigues, Tiago
    DIGITAL DISCOVERY, 2022, 1 (03) : 209 - 215
  • [43] Machine Learning guided early drug discovery of small molecules
    Pillai, Nikhil
    Dasgupta, Aparajita
    Sudsakorn, Sirimas
    Fretland, Jennifer
    Mavroudis, Panteleimon D.
    DRUG DISCOVERY TODAY, 2022, 27 (08) : 2209 - 2215
  • [44] Predicting the Propensity of Customers to Pay via Mobile Applications with Machine Learning Methods
    Ozkan, Ece
    Ceran, Berkan
    Merl, Buse
    Eskiocak, Defne Idil
    Yuceoglu, Birol
    INTELLIGENT AND FUZZY SYSTEMS, VOL 2, INFUS 2024, 2024, 1089 : 162 - 168
  • [45] Machine Learning Methods with Noisy, Incomplete or Small Datasets
    Caiafa, Cesar F.
    Sun, Zhe
    Tanaka, Toshihisa
    Marti-Puig, Pere
    Sole-Casals, Jordi
    APPLIED SCIENCES-BASEL, 2021, 11 (09):
  • [46] Quantitative Prediction of Hemolytic Toxicity for Small Molecules and Their Potential Hemolytic Fragments by Machine Learning and Recursive Fragmentation Methods
    Zheng, Suqing
    Xiong, Jun
    Wang, Yibing
    Liang, Guang
    Xu, Yong
    Lin, Fu
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2020, 60 (06) : 3231 - 3245
  • [47] Classification-based machine learning approaches to predict the taste of molecules: A review
    Rojas, Cristian
    Ballabio, Davide
    Consonni, Viviana
    Suarez-Estrella, Diego
    Todeschini, Roberto
    FOOD RESEARCH INTERNATIONAL, 2023, 171
  • [48] Leveraging informatics and machine learning to predict physical properties of organic compounds
    Liosi, Maria-Elena
    Spyriouni, Theodora
    Krokidis, Xenophon
    Subramanian, Lalitha
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2017, 253
  • [49] Machine Learning for Orbital Energies of Organic Molecules Upwards of 100 Atoms
    Gaul, Christopher
    Cuesta-Lopez, Santiago
    PHYSICA STATUS SOLIDI B-BASIC SOLID STATE PHYSICS, 2024, 261 (01):
  • [50] Research Progress on New Organic Molecules Design via Machine Learning
    Tan, Pang
    Liu, Xuhong
    Chen, Tongtong
    Qin, Zhihui
    Yang, Tao
    Liu, Xiaotong
    Liu, Xiulei
    CHINESE JOURNAL OF ORGANIC CHEMISTRY, 2021, 41 (07) : 2666 - 2675