Chemical space-informed machine learning models for rapid predictions of x-ray photoelectron spectra of organic molecules

Times cited: 0
Authors
Tripathy, Susmita [1]
Das, Surajit [1]
Jindal, Shweta [1]
Ramakrishnan, Raghunathan [1]
Affiliations
[1] Tata Inst Fundamental Res, Hyderabad 500046, India
Source
MACHINE LEARNING-SCIENCE AND TECHNOLOGY | 2024 / Volume 5 / Issue 04
Keywords
x-ray photoelectron spectra; core-electron binding energy; density functional theory; machine learning; chemical space; LEVEL BINDING-ENERGIES; QUANTUM-CHEMISTRY; XPS SPECTRA; APPROXIMATION; SPECTROSCOPY; STATES; ATOMS;
DOI
10.1088/2632-2153/ad871d
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present machine learning models based on kernel-ridge regression for predicting x-ray photoelectron spectra of organic molecules originating from the K-shell ionization energies of carbon (C), nitrogen (N), oxygen (O), and fluorine (F) atoms. We constructed the training dataset through high-throughput calculations of K-shell core-electron binding energies (CEBEs) for 12 880 small organic molecules in the bigQM7ω dataset, employing the Δ-SCF formalism coupled with meta-GGA-DFT and a variationally converged basis set. The models are cost-effective, as they require the atomic coordinates of a molecule generated using universal force fields while estimating the target-level CEBEs corresponding to DFT-level equilibrium geometry. We explore transfer learning by utilizing the atomic environment feature vectors learned using a graph neural network framework in kernel-ridge regression. Additionally, we enhance accuracy within the Δ-machine learning framework by leveraging inexpensive baseline spectra derived from Kohn-Sham eigenvalues. When applied to 208 combinatorially substituted uracil molecules larger than those in the training set, our analyses suggest that the models may not provide quantitatively accurate predictions of CEBEs but offer a strong linear correlation relevant for virtual high-throughput screening. We present the dataset and models as the Python module, cebeconf, to facilitate further explorations.
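The abstract combines kernel-ridge regression with a Delta-machine learning (Δ-ML) correction on top of an inexpensive baseline. The following is a minimal sketch of that general idea only, not the authors' implementation and not the cebeconf API. The descriptors, the Gaussian kernel, the hyperparameters `sigma` and `lam`, and all numerical values are assumptions chosen for illustration, and the "baseline" and "target" energies are synthetic stand-ins rather than Kohn-Sham or Δ-SCF data.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, sigma):
    """Predict by summing kernel similarities weighted by alpha."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha


# Synthetic data (assumption): 3 hypothetical descriptor components per atom.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
baseline = 290.0 + 2.0 * X[:, 0]                  # stand-in cheap estimate (eV)
target = baseline + 0.5 * np.sin(3.0 * X[:, 1])   # stand-in reference value (eV)

train, test = np.arange(150), np.arange(150, 200)

# Delta-ML: learn only the difference between target and baseline.
alpha = krr_fit(X[train], (target - baseline)[train], sigma=0.5, lam=1e-4)
pred = baseline[test] + krr_predict(X[train], alpha, X[test], sigma=0.5)

mae = np.abs(pred - target[test]).mean()
baseline_mae = np.abs(baseline[test] - target[test]).mean()
print(f"baseline MAE: {baseline_mae:.3f} eV, Delta-ML MAE: {mae:.3f} eV")
```

In this toy setting the model only has to learn the smooth residual rather than the full energy, which is the usual motivation for a Δ-ML setup. Real applications would need atomic-environment descriptors and cross-validated hyperparameters.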
Pages: 17