Extracting Structural Information from Physicochemical Property Measurements Using Machine Learning - A New Approach for Structure Elucidation in Non-targeted Analysis

Cited by: 5
Authors
Abrahamsson, Dimitri [1, 2]
Brueck, Christopher L. [3, 4]
Prasse, Carsten [3, 5]
Lambropoulou, Dimitra A. [6, 7, 8]
Koronaiou, Lelouda-Athanasia [6, 7, 8]
Wang, Miaomiao [9]
Park, June-Soo [2, 9]
Woodruff, Tracey J. [2]
Affiliations
[1] NYU, Dept Pediat, Grossman Sch Med, New York, NY 10016 USA
[2] Univ Calif San Francisco, Program Reprod Hlth & Environm, Dept Obstet Gynecol & Reprod Sci, San Francisco, CA 94107 USA
[3] Johns Hopkins Univ, Dept Environm Hlth & Engn, Baltimore, MD 21205 USA
[4] Exponent Environm & Earth Sci Practice, Bellevue, WA 98007 USA
[5] Johns Hopkins Univ, Risk Sci & Publ Policy Inst, Bloomberg Sch Publ Hlth, Baltimore, MD 21205 USA
[6] Aristotle Univ Thessaloniki, Dept Chem, Univ Campus, Thessaloniki 54124, Greece
[7] Aristotle Univ Thessaloniki, Dept Chem, Lab Environm Pollut Control, GR-54124 Thessaloniki, Greece
[8] Ctr Interdisciplinary Res & Innovat CIRI AUTH, Balkan Ctr, GR-57001 Thessaloniki, Greece
[9] Calif Environm Agcy, Environm Chem Lab, Dept Tox Subst Control, Berkeley, CA 94710 USA
Keywords
non-targeted analysis; machine learning; physicochemical properties; structure elucidation
DOI
10.1021/acs.est.3c03003
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Non-targeted analysis (NTA) has made critical contributions in the fields of environmental chemistry and environmental health. One critical bottleneck is the lack of available analytical standards for most chemicals in the environment. Our study aims to explore a novel approach that links measurements of equilibrium partition ratios between organic solvents and water (K_SW) to predictions of molecular structures. These properties can be used as a fingerprint, which a machine learning algorithm can convert into a series of functional groups (RDKit fragments) that can then be used to search chemical databases. We conducted partitioning experiments using a chemical mixture containing 185 chemicals in 10 different organic solvents and water. Both a liquid chromatography quadrupole time-of-flight mass spectrometer (LC-QTOF MS) and an LC-Orbitrap MS were used to assess the feasibility of the experimental method and the accuracy of the algorithm at predicting the correct functional groups. The two methods showed differences in log K_SW, with the QTOF method showing a mean absolute error (MAE) of 0.22 and the Orbitrap method 0.33. These differences also propagated into errors in the predictions of RDKit fragments, with an MAE of 0.23 for the QTOF method and 0.31 for the Orbitrap method. Our approach presents a new angle on structure elucidation for NTA and shows promise in assisting with compound identification.
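The workflow summarized in the abstract (partition measurements, a log K_SW fingerprint, predicted fragment counts, a database search) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the L1-distance ranking, the candidate names, and all numbers are assumptions made for the example, and the machine learning step that maps a fingerprint to fragment counts is not shown. The dictionary keys borrow fragment names from `rdkit.Chem.Fragments`, but RDKit itself is not called.

```python
import math


def log_ksw(conc_solvent, conc_water):
    """log10 of the equilibrium partition ratio K_SW = C_solvent / C_water."""
    if conc_solvent <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_solvent / conc_water)


def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def rank_candidates(predicted_counts, candidates):
    """Sort candidate names by L1 distance between fragment-count dicts.

    Assumed ranking rule for illustration; missing fragments count as 0.
    """
    def distance(counts):
        keys = set(predicted_counts) | set(counts)
        return sum(
            abs(predicted_counts.get(k, 0) - counts.get(k, 0)) for k in keys
        )

    return sorted(candidates, key=lambda name: distance(candidates[name]))


# Hypothetical values, for illustration only (not data from the study).
# One log K_SW value per organic solvent forms the fingerprint.
fingerprint = [log_ksw(50.0, 0.5), log_ksw(2.0, 2.0)]

# Fragment counts that a trained model might return for that fingerprint.
predicted = {"fr_benzene": 1, "fr_Al_OH": 0, "fr_ketone": 1}

# Fragment counts of hypothetical database entries.
database = {
    "candidate_A": {"fr_benzene": 1, "fr_ketone": 1},
    "candidate_B": {"fr_Al_OH": 2},
}
ranking = rank_candidates(predicted, database)
```

In this sketch, `mean_absolute_error` could be applied either to log K_SW values (measured against reference) or to fragment counts (predicted against true), which are the two kinds of error the abstract reports.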
Pages: 14827-14838
Number of pages: 12