The prediction of crystal densities of a big data set using 1D and 2D structure features

Cited by: 0
Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; NITRATE ESTERS; IONIC LIQUIDS; QSPR; ENTHALPIES; VAPORIZATION; EXPLOSIVES; NITRAMINES; SURFACE; HEAT;
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
A large data set of over 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program with a function that ensures each atom (or bond) of a molecule is represented by Benson's groups exactly once for atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models to predict the density. Further, variable selection of the descriptors was performed using RF variable importance. For PLS, the combination of the models built from atom-based and bond-based descriptors achieved the best results in this paper: for the cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, mean absolute error (MAE) = 0.0270 g·cm⁻³, and root mean squared error (RMSE) = 0.0426 g·cm⁻³; for the prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
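The three figures of merit quoted in the abstract (R, MAE, RMSE) have standard definitions, sketched below in plain Python. This is only an illustration of those definitions, not the authors' code (their descriptor extraction is described as an in-house Java program). The function names and the four density values are made up for the example and do not come from the paper.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical densities in g·cm⁻³ (illustrative values, not data from the paper).
observed = [1.20, 1.35, 1.50, 1.65]
predicted = [1.22, 1.33, 1.53, 1.62]

print(f"R    = {pearson_r(observed, predicted):.4f}")
print(f"MAE  = {mae(observed, predicted):.4f} g·cm⁻³")
print(f"RMSE = {rmse(observed, predicted):.4f} g·cm⁻³")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which matches the pattern in the values reported above.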
Pages: 1375-1385
Page count: 11
Related papers
50 in total
  • [31] Intra- and inter-molecular interactions in choline-based ionic liquids studied by 1D and 2D NMR
    Veroutis, Emmanouil
    Merz, Steffen
    Eichel, Rudiger A.
    Granwehr, Josef
    JOURNAL OF MOLECULAR LIQUIDS, 2021, 322
  • [32] Synergism of 1D CdS/2D Modified Ti3C2Tx MXene Heterojunctions for Boosted Photocatalytic Hydrogen Production
    Cheng, Shi
    Xiong, Qianqian
    Zhao, Chengxiao
    Yang, Xiaofei
    CHINESE JOURNAL OF STRUCTURAL CHEMISTRY, 2022, 41 (08) : 2208058 - 2208064
  • [33] Synthesis, Crystal Structure and Comparative DFT Studies of a 1D Ni(II) Polymeric Complex of 2-Hydroxypyridine-N-oxide
    Makhyoun, Mohamed A.
    Palmer, Rex A.
    Soayed, Amina A.
    Refaat, Heba M.
    Basher, Dina E.
    JOURNAL OF CHEMICAL CRYSTALLOGRAPHY, 2016, 46 (6-7) : 269 - 279
  • [34] Constructing boron-doped graphitic carbon nitride with 2D/1D porous hierarchical architecture and efficient N2 photofixation
    Yu, Jing
    Xiong, Shihao
    Wang, Bichen
    Wang, Rui
    He, Beibei
    Jin, Jun
    Wang, Huanwen
    Gong, Yansheng
    COLLOIDS AND SURFACES A-PHYSICOCHEMICAL AND ENGINEERING ASPECTS, 2023, 656
  • [35] GLOBAL REGULARITY TO THE 2D INHOMOGENEOUS LIQUID CRYSTAL FLOWS WITH LARGE INITIAL DATA AND VACUUM
    Liu, Yang
    Guo, Renying
    Zhou, Nan
    ROCKY MOUNTAIN JOURNAL OF MATHEMATICS, 2022, 52 (06) : 2085 - 2099
  • [36] 1D and 2D NMR Spectroscopy of Bonding Interactions within Stable and Phase-Separating Organic Electrolyte-Cellulose Solutions
    Clough, Matthew T.
    Fares, Christophe
    Rinaldi, Roberto
    CHEMSUSCHEM, 2017, 10 (17) : 3452 - 3458
  • [37] Optical nonlinear effects of nickel and cobalt substituents in 1D/2D manganese tungstate/rGO nanocomposite for smart filtering optical radiation
    Madhubala, V.
    Sahni, Amegha
    Sujatha, R. Annie
    JOURNAL OF PHOTOCHEMISTRY AND PHOTOBIOLOGY A-CHEMISTRY, 2023, 438
  • [38] 2D joint inversion of geophysical data using petrophysical clustering and facies deformation
    Zhang, J.
    Revil, A.
    GEOPHYSICS, 2015, 80 (05) : M69 - M88
  • [39] Prediction of reverse electrodialysis performance by inclusion of 2D fluorescence spectroscopy data into multivariate statistical models
    Pawlowski, Sylwin
    Galinha, Claudia F.
    Crespo, Joao G.
    Velizarov, Svetlozar
    SEPARATION AND PURIFICATION TECHNOLOGY, 2015, 150 : 159 - 169
  • [40] Prediction of the anomeric configuration, type of linkage, and residues in disaccharides from 1D 13C NMR data
    Pereira, Florbela
    CARBOHYDRATE RESEARCH, 2011, 346 (07) : 960 - 972