The prediction of crystal densities of a big data set using 1D and 2D structure features

Citations: 0
Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; Nitrate esters; Ionic liquids; QSPR; Enthalpies; Vaporization; Explosives; Nitramines; Surface; Heat
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
A large data set of more than 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program with a function that ensures each atom (or bond) of a molecule is represented by Benson's groups exactly once for atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models to predict the density. Variable selection of the descriptors was then performed using the variable importance of RF. For PLS, the combination of the models built from atom-based and bond-based descriptors achieved the best results in this paper: for cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, mean absolute error (MAE) = 0.0270 g·cm⁻³, and root mean squared error (RMSE) = 0.0426 g·cm⁻³; for prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
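The three statistics quoted in the abstract (R, MAE, RMSE) follow their standard definitions. The minimal Python sketch below shows how they could be computed for a set of predicted densities. It is not the authors' code: the function names and the sample densities are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical densities in g·cm⁻³ (illustrative values, not data from the paper).
observed = [1.0, 1.2, 1.5, 1.8]
predicted = [1.1, 1.1, 1.6, 1.7]

print("R    =", round(pearson_r(observed, predicted), 4))
print("MAE  =", round(mae(observed, predicted), 4))
print("RMSE =", round(rmse(observed, predicted), 4))
```

RMSE penalizes large individual errors more heavily than MAE, which is consistent with the RMSE values in the abstract being larger than the corresponding MAE values.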
Pages: 1375-1385
Number of pages: 11