Improving predictions of compound amenability for liquid chromatography–mass spectrometry to enhance non-targeted analysis

Cited by: 0
Authors
Nathaniel Charest
Charles N. Lowe
Christian Ramsland
Brian Meyer
Vicente Samano
Antony J. Williams
Affiliations
[1] Center for Computational Toxicology and Exposure
[2] Office of Research and Development
[3] U.S. Environmental Protection Agency
[4] Oak Ridge Associated Universities
[5] Senior Environmental Employment Program
[6] U.S. Environmental Protection Agency
Source
Analytical and Bioanalytical Chemistry | 2024 / Vol. 416
Keywords
Predictive modeling; Mass spectrometry; Non-targeted analysis; Suspect screening analysis
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Mass-spectrometry-based non-targeted analysis (NTA), in which mass spectrometric signals are assigned chemical identities based on a systematic collation of evidence, is a growing area of interest for toxicological risk assessment. Successful NTA results in better identification of potentially hazardous pollutants within the environment, facilitating the development of targeted analytical strategies to best characterize risks to human and ecological health. A supporting component of the NTA process involves assessing whether suspected chemicals are amenable to the mass spectrometric method, which is necessary in order to assign an observed signal to the chemical structure. Prior work from this group involved the development of a random forest model for predicting the amenability of 5517 unique chemical structures to liquid chromatography–mass spectrometry (LC-MS). This work improves the interpretability of the group’s prior model of the same endpoint and integrates 1348 more data points across negative and positive ionization modes. We enhance interpretability by feature engineering, a machine learning practice that reduces the input dimensionality while attempting to preserve performance statistics. We emphasize the importance of interpretable machine learning models within the context of building confidence in NTA identification. The new data were curated by expert curators, who labeled compounds as amenable or unamenable, resulting in an enhanced set of chemical compounds that expands the applicability domain of the prior model. The cross-validated (CV) balanced accuracy (BA) of the newly developed model is comparable to the performance previously reported (mean CV BA is 0.84 vs. 0.82 in positive mode, and 0.85 vs. 0.82 in negative mode), while on a novel external set, derived from this work’s data, the Matthews correlation coefficients (MCC) for the new models are 0.66 and 0.68 for positive and negative mode, respectively. Our group’s prior published models scored MCC of 0.55 and 0.54 on the same external sets. This demonstrates appreciable improvement over the chemical space captured by the expanded dataset. This work forms part of our ongoing efforts to develop models with higher interpretability and higher performance to support NTA efforts.
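The abstract reports two binary-classification metrics, balanced accuracy (BA) and the Matthews correlation coefficient (MCC). As a minimal sketch of how these are conventionally computed from a confusion matrix, the following uses their standard definitions. The function names and the example counts are hypothetical, chosen only for illustration; they are not taken from the paper's data or code.

```python
import math


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity (recall on amenable) and specificity (recall on unamenable)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Standard MCC; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts (NOT from the paper):
# "positive" = compound labeled amenable to LC-MS.
tp, fp, tn, fn = 80, 10, 90, 20

ba = balanced_accuracy(tp, fp, tn, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"BA = {ba:.2f}, MCC = {mcc:.2f}")
```

BA ranges from 0 to 1 with 0.5 as chance level, while MCC ranges from -1 to 1 with 0 as chance level, so the two sets of figures quoted in the abstract are not on the same scale and should not be compared directly.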
Pages: 2565-2579
Page count: 14