Machine learning methods to predict the crystallization propensity of small organic molecules

Cited by: 11
Authors
Pereira, Florbela [1, 2]
Affiliations
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, LAQV, Caparica, Portugal
[2] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Quim, REQUIMTE, Caparica, Portugal
Keywords
CLASSIFICATION; STABILITY; TENDENCY;
DOI
10.1039/d0ce00070a
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
Machine learning (ML) algorithms were explored for predicting crystallization propensity from molecular descriptors and fingerprints generated from 2D chemical structures, together with 3D molecular descriptors computed from 3D chemical structures optimized with empirical methods. In total, 57 815 molecules were retrieved from the Reaxys® database, of which 53 998 are recorded as crystalline (class A), 3097 as polymorphic (class B), and 720 as amorphous (class C). A training set of 40 462 organic molecules was used to build the models, which were validated with an external test set of 17 353 organic molecules. Several ML algorithms, including random forest (RF), support vector machines (SVM), and deep learning multilayer perceptron networks (MLP), were screened. The best performance was achieved with a consensus classification model built from the RF, SVM, and MLP models, which predicted the external test set with an overall predictive accuracy (Q) of up to 80%.
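The consensus step described in the abstract can be illustrated with a minimal Python sketch. It assumes a simple majority vote over the three models' class labels; the abstract does not say how the consensus was actually formed, so the voting rule, the tie-break, and the function names `consensus_vote` and `overall_accuracy` are hypothetical. The five example labels are made up for illustration; only the class counts and set sizes come from the abstract.

```python
from collections import Counter

# Class counts quoted in the abstract (A = crystalline, B = polymorphic, C = amorphous).
CLASS_COUNTS = {"A": 53998, "B": 3097, "C": 720}
TRAIN_SIZE, TEST_SIZE = 40462, 17353


def consensus_vote(predictions, fallback_index=0):
    """Majority vote over per-model labels for one molecule.

    If no label wins a strict majority, fall back to the model at
    fallback_index. This tie-break is an assumption, not from the abstract.
    """
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes > len(predictions) / 2 else predictions[fallback_index]


def overall_accuracy(y_true, y_pred):
    """Overall predictive accuracy Q: fraction of correctly classified molecules."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for five molecules, for illustration only.
y_true = ["A", "A", "B", "C", "A"]
rf_pred = ["A", "A", "B", "A", "A"]
svm_pred = ["A", "B", "B", "C", "A"]
mlp_pred = ["A", "A", "A", "C", "B"]

# One consensus label per molecule, combining the three models' votes.
y_consensus = [consensus_vote(votes) for votes in zip(rf_pred, svm_pred, mlp_pred)]

# Sanity check on the abstract's numbers: classes and train/test split
# should both sum to the same total.
total = sum(CLASS_COUNTS.values())
```

In this toy data each individual model misclassifies at least one molecule, while the majority vote recovers every label, which is the usual motivation for a consensus model. The class counts and the training/test split both sum to 57 815, consistent with the total given in the abstract.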
Pages: 2817 - 2826
Number of pages: 10
Related papers
50 in total
  • [21] Mechanistic Study on the Effect of Magnetic Field on the Crystallization of Organic Small Molecules
    Zhao, Yihan
    Hou, Baohong
    Liu, Chunhao
    Ji, Xiongtao
    Huang, Yunhai
    Sui, Jingchen
    Liu, Dong
    Wang, Na
    Hao, Hongxun
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2021, 60 (43) : 15741 - 15751
  • [22] High-throughput nanoscale crystallization of small organic molecules and pharmaceuticals
    Metherall, J. P.
    Corner, P. A.
    McCabe, J. F.
    Hall, M. J.
    Probert, M. R.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2024, 80
  • [23] NANO-CRYSTALLIZATION: APPLYING THE METHODS OF MACROMOLECULAR CRYSTALLOGRAPHY FOR SMALL MOLECULES
    Babor, Martin
    Nievergelt, Philipp P.
    Cejka, Jan
    Spingler, Bernhard
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2019, 75 : E660 - E660
  • [24] Nanotemplated crystallization of organic molecules
    Plain, Jerome
    Pallandre, Antoine
    Nysten, Bernard
    Jonas, Alain M.
    SMALL, 2006, 2 (07) : 892 - 897
  • [25] Predict the Polarizability and Order of Magnitude of Second Hyperpolarizability of Molecules by Machine Learning
    Zhao, Guoxiang
    Yan, Weiyin
    Wang, Zirui
    Kang, Yao
    Ma, Zuju
    Gu, Zhi-Gang
    Li, Qiao-Hong
    Zhang, Jian
    JOURNAL OF PHYSICAL CHEMISTRY A, 2023, 127 (29) : 6109 - 6115
  • [26] Machine learning to predict the specific optical rotations of chiral fluorinated molecules
    Chen, Mengyao
    Wu, Ting
    Xiao, Kaixia
    Zhao, Tanfeng
    Zhoa, Yanmei
    Zhang, Qingyou
    Aires-de-Sousa, Joao
    SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 2019, 223
  • [27] TMCrys: predict propensity of success for transmembrane protein crystallization
    Varga, Julia K.
    Tusnady, Gabor E.
    BIOINFORMATICS, 2018, 34 (18) : 3126 - 3130
  • [28] Using Machine Learning To Predict Suitable Conditions for Organic Reactions
    Gao, Hanyu
    Struble, Thomas J.
    Coley, Connor W.
    Wang, Yuran
    Green, William H.
    Jensen, Klavs F.
    ACS CENTRAL SCIENCE, 2018, 4 (11) : 1465 - 1476
  • [29] Predicting crystallisation propensity of small molecules
    Wicker, J.
    Cooper, R.
    David, W.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2014, 70 : C1628 - C1628
  • [30] Using Machine Learning to Predict the Dissociation Energy of Organic Carbonyls
    Yu, Haishan
    Wang, Ying
    Wang, Xijun
    Zhang, Jinxiao
    Ye, Sheng
    Huang, Yan
    Luo, Yi
    Sharman, Edward
    Chen, Shilu
    Jiang, Jun
    JOURNAL OF PHYSICAL CHEMISTRY A, 2020, 124 (19) : 3844 - 3850