Bypassing the Identification: MS2Quant for Concentration Estimations of Chemicals Detected with Nontarget LC-HRMS from MS2 Data

Times cited: 18
Authors
Sepman, Helen [1, 2]
Malm, Louise [1]
Peets, Pilleriin [1]
MacLeod, Matthew [2]
Martin, Jonathan [3]
Breitholtz, Magnus [2]
Kruve, Anneli [1, 2]
Affiliations
[1] Stockholm Univ, Dept Mat & Environm Chem, S-10691 Stockholm, Sweden
[2] Stockholm Univ, Dept Environm Sci, S-10691 Stockholm, Sweden
[3] Stockholm Univ, Dept Environm Sci, Sci Life Lab, S-10691 Stockholm, Sweden
Funding
Swedish Research Council;
Keywords
ELECTROSPRAY-IONIZATION EFFICIENCY; RESOLUTION MASS-SPECTROMETRY; MOBILE-PHASE; SEMI-QUANTIFICATION; WATER; CONTAMINANTS; SUBSTANCES; PREDICTION; PARAMETERS; SUSPECT;
DOI
10.1021/acs.analchem.3c01744
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Nontarget analysis by liquid chromatography-high-resolution mass spectrometry (LC-HRMS) is now widely used to detect pollutants in the environment. Shifting away from targeted methods has led to detection of previously unseen chemicals, and assessing the risk posed by these newly detected chemicals is an important challenge. Assessing exposure and toxicity of chemicals detected with nontarget HRMS is highly dependent on the knowledge of the structure of the chemical. However, the majority of features detected in nontarget screening remain unidentified and therefore the risk assessment with conventional tools is hampered. Here, we developed MS2Quant, a machine learning model that enables prediction of concentration from fragmentation (MS2) spectra of detected, but unidentified chemicals. MS2Quant is an xgbTree algorithm-based regression model developed using ionization efficiency data for 1191 unique chemicals that spans 8 orders of magnitude. The ionization efficiency values are predicted from structural fingerprints that can be computed from the SMILES notation of the identified chemicals or from MS2 spectra of unidentified chemicals using SIRIUS+CSI:FingerID software. The root mean square errors of the training and test sets were 0.55 (3.5x) and 0.80 (6.3x) log-units, respectively. In comparison, ionization efficiency prediction approaches that depend on assigning an unequivocal structure typically yield errors from 2x to 6x. The MS2Quant quantification model was validated on a set of 39 environmental pollutants and resulted in a mean prediction error of 7.4x, a geometric mean of 4.5x, and a median of 4.0x. For comparison, a model based on PaDEL descriptors that depends on unequivocal structural assignment was developed using the same dataset. The latter approach yielded a comparable mean prediction error of 9.5x, a geometric mean of 5.6x, and a median of 5.2x on the validation set chemicals when the top structural assignment was used as input.
This confirms that MS2Quant makes it possible to extract exposure information for unidentified chemicals which, although detected, have thus far been disregarded due to lack of accurate tools for quantification. The MS2Quant model is available as an R-package on GitHub for improving discovery and monitoring of potentially hazardous environmental pollutants with nontarget screening.
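The abstract reports errors both in log-units and as fold errors (for example, 0.55 log-units alongside 3.5x). The short Python sketch below illustrates how these two scales relate and how a mean, geometric mean, and median of fold errors could be summarized. It is only an illustration under stated assumptions: the function names are hypothetical, the symmetric fold-error definition (the larger of predicted/actual and actual/predicted) is an assumption rather than something the abstract states, and it is not part of the MS2Quant R package.

```python
import statistics


def log_to_fold(log_error: float) -> float:
    """Convert an error in log10 units to a multiplicative (fold) error."""
    return 10 ** log_error


def fold_error(predicted: float, actual: float) -> float:
    """Symmetric fold error, always >= 1 (assumed definition, see lead-in)."""
    ratio = predicted / actual
    return max(ratio, 1 / ratio)


def summarize(fold_errors: list[float]) -> dict[str, float]:
    """Mean, geometric mean, and median of a list of fold errors."""
    return {
        "mean": statistics.fmean(fold_errors),
        "geometric_mean": statistics.geometric_mean(fold_errors),
        "median": statistics.median(fold_errors),
    }


# RMSE values quoted in the abstract, converted from log-units to fold errors.
print(round(log_to_fold(0.55), 1))  # 3.5
print(round(log_to_fold(0.80), 1))  # 6.3
```

Because fold errors are right-skewed, a few large misses pull the arithmetic mean above the geometric mean and the median, which matches the ordering of the three summary values reported for the validation set.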
Pages: 12329 - 12338
Number of pages: 10