The prediction of crystal densities of a big data set using 1D and 2D structure features

Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; NITRATE ESTERS; IONIC LIQUIDS; QSPR; ENTHALPIES; VAPORIZATION; EXPLOSIVES; NITRAMINES; SURFACE; HEAT;
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
A large data set of more than 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program that includes a function to ensure that each atom (or bond) of a molecule is represented by Benson's groups once for the atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models for predicting the density, and descriptors were further selected according to RF variable importance. For PLS, the combination of the models built on atom-based and bond-based descriptors achieved the best results in this paper. For cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, the mean absolute error (MAE) = 0.0270 g·cm⁻³, and the root mean squared error (RMSE) = 0.0426 g·cm⁻³. For prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
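The three statistics quoted in the abstract (Pearson R, MAE, and RMSE) follow standard definitions, and the minimal Python sketch below shows how they could be computed from observed and predicted densities. It is not the authors' Java program; the function names and the sample densities are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


if __name__ == "__main__":
    # Hypothetical densities in g/cm^3, made up for this example
    # (not taken from the paper's data set).
    observed = [1.21, 1.35, 1.62, 1.80]
    predicted = [1.25, 1.31, 1.60, 1.86]
    print("R    =", round(pearson_r(observed, predicted), 4))
    print("MAE  =", round(mae(observed, predicted), 4))
    print("RMSE =", round(rmse(observed, predicted), 4))
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same predictions, which matches the pattern in the reported values (for example, 0.0426 versus 0.0270 g·cm⁻³ in cross-validation).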
Pages: 1375-1385
Number of pages: 11