Machine learning methods to predict the crystallization propensity of small organic molecules

Cited by: 11
Authors
Pereira, Florbela [1, 2]
Affiliations
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, LAQV, Caparica, Portugal
[2] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, REQUIMTE, Caparica, Portugal
Keywords
CLASSIFICATION; STABILITY; TENDENCY
DOI
10.1039/d0ce00070a
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Machine learning (ML) algorithms were explored for predicting crystallization propensity from molecular descriptors and fingerprints generated from 2D chemical structures, together with 3D molecular descriptors from 3D chemical structures optimized with empirical methods. In total, 57 815 molecules were retrieved from the Reaxys® database; of these, 53 998 are recorded as crystalline (class A), 3097 as polymorphic (class B), and 720 as amorphous (class C). A training set of 40 462 organic molecules was used to build the models, which were validated with an external test set of 17 353 organic molecules. Several ML algorithms, including random forest (RF), support vector machines (SVM), and deep learning multilayer perceptron networks (MLP), were screened. The best performance was achieved with a consensus classification model built from the RF, SVM, and MLP models, which predicted the external test set with an overall predictive accuracy (Q) of up to 80%.
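The abstract does not state how the consensus of the RF, SVM, and MLP models was formed. As a minimal sketch, assuming a simple majority vote with ties resolved in favor of the first model (both are assumptions, not the author's documented procedure), the consensus step and the overall accuracy Q could look like the following. The function names and the five-molecule toy labels are hypothetical; only the data set counts are taken from the abstract.

```python
from collections import Counter

# Counts quoted in the abstract (class A: crystalline, B: polymorphic, C: amorphous).
CLASS_COUNTS = {"A": 53998, "B": 3097, "C": 720}
TRAIN_SIZE, TEST_SIZE = 40462, 17353
TOTAL = sum(CLASS_COUNTS.values())  # should match the 57 815 molecules retrieved


def consensus_vote(*model_predictions):
    """Majority vote per molecule across models (assumed consensus rule).

    Ties are resolved in favor of the first model's label (also an assumption).
    """
    consensus = []
    for labels in zip(*model_predictions):
        counts = Counter(labels)
        top_count = max(counts.values())
        tied = [label for label, n in counts.items() if n == top_count]
        consensus.append(tied[0] if len(tied) == 1 else labels[0])
    return consensus


def overall_accuracy(y_true, y_pred):
    """Overall predictive accuracy Q: fraction of correctly assigned classes."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for five molecules; NOT data from the paper.
y_true = ["A", "A", "B", "C", "A"]
rf_pred = ["A", "A", "B", "A", "A"]
svm_pred = ["A", "B", "B", "C", "A"]
mlp_pred = ["A", "A", "A", "B", "B"]

consensus = consensus_vote(rf_pred, svm_pred, mlp_pred)
q = overall_accuracy(y_true, consensus)
print(consensus, q)
```

In this toy run the fourth molecule receives three different labels, so the tie rule falls back to the RF label. Other consensus schemes, such as probability averaging or requiring unanimous agreement, would be equally consistent with the wording of the abstract.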
Pages: 2817-2826
Page count: 10