Evaluating the generalizability of graph neural networks for predicting collision cross section

Times cited: 0
Authors
Engler Hart, Chloe [1 ]
Preto, Antonio Jose [1 ]
Chanana, Shaurya [1 ]
Healey, David [1 ]
Kind, Tobias [1 ]
Domingo-Fernandez, Daniel [1 ]
Affiliations
[1] Enveda Biosci Inc, 5700 Flatiron Pkwy, Boulder, CO 80301 USA
Source
JOURNAL OF CHEMINFORMATICS | 2024 / Volume 16 / Issue 01
Keywords
APPLICABILITY DOMAIN;
DOI
10.1186/s13321-024-00899-w
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Ion Mobility coupled with Mass Spectrometry (IM-MS) is a promising analytical technique that enhances molecular characterization by measuring collision cross-section (CCS) values, which are indicative of molecular size and shape. However, the effective application of CCS values in structural analysis is still constrained by the limited availability of experimental data, necessitating the development of accurate machine learning (ML) models for in silico predictions. In this study, we evaluated state-of-the-art Graph Neural Networks (GNNs) trained to predict CCS values using the largest publicly available dataset to date. Although our results confirm the high accuracy of these models within chemical spaces similar to their training data, their performance declines significantly when they are applied to structurally novel regions. This discrepancy raises concerns about the reliability of in silico CCS predictions and underscores the need to release further publicly available CCS datasets. To mitigate this, we introduce Mol2CCS, which demonstrates how generalization can be partially improved by extending models to account for additional features such as molecular fingerprints, descriptors, and molecule types. Lastly, we show how confidence models can enhance the reliability of the CCS estimates.
Scientific contribution
We have benchmarked state-of-the-art graph neural networks for predicting collision cross section. Our work highlights the accuracy of these models when training and prediction take place in similar chemical spaces, but also how their accuracy drops when they are evaluated in structurally novel regions. We conclude by presenting potential approaches to mitigate this issue.
Pages: 11