Graph Convolutional Networks for Improved Prediction and Interpretability of Chromatographic Retention Data

Cited by: 37
Authors
Kensert, Alexander [1, 2]
Bouwmeester, Robbin [3, 4]
Efthymiadis, Kyriakos [1, 5]
Van Broeck, Peter [6]
Desmet, Gert [2]
Cabooter, Deirdre [1]
Affiliations
[1] Univ Leuven, Dept Pharmaceut & Pharmacol Sci, KU Leuven, B-3000 Leuven, Belgium
[2] Vrije Univ Brussel, Dept Chem Engn, B-1050 Brussels, Belgium
[3] VIB, VIB UGent Ctr Med Biotechnol, B-9052 Ghent, Belgium
[4] Univ Ghent, Dept Biomol Med, B-9052 Ghent, Belgium
[5] Vrije Univ Brussel, Dept Comp Sci, Artificial Intelligence Lab, B-1050 Brussels, Belgium
[6] Janssen Pharmaceut, Dept Pharmaceut Dev & Mfg Sci, B-2340 Beerse, Belgium
Keywords
HYDROPHILIC-INTERACTION CHROMATOGRAPHY; INTERACTION LIQUID-CHROMATOGRAPHY; SEPARATION; TIME; MECHANISM; DESIGN; PHASE;
DOI
10.1021/acs.analchem.1c02988
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Machine learning is a popular technique for predicting the retention times of molecules based on descriptors. Descriptors and associated labels (e.g., retention times) of a set of molecules can be used to train a machine learning algorithm. However, descriptors are fixed molecular features that are not necessarily optimized for the given machine learning problem (e.g., predicting retention times). Recent advances in molecular machine learning make use of so-called graph convolutional networks (GCNs), which learn molecular representations from atoms and their bonds to adjacent atoms, so that the representation is optimized for the given problem. In this study, two GCNs were implemented to predict the retention times of molecules for three different chromatographic data sets and compared to seven benchmarks (including two state-of-the-art machine learning models). Additionally, saliency maps were computed from trained GCNs to better interpret the importance of certain molecular substructures in the data sets. Based on the overall observations of this study, the GCNs performed better than all benchmarks, either significantly outperforming them (5-25% lower mean absolute error) or performing similarly to them (<5% difference). Saliency maps revealed a significant difference in the molecular substructures that are important for predictions on different chromatographic data sets (reversed-phase liquid chromatography vs hydrophilic interaction liquid chromatography).
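To make the idea of learning a molecular representation from atoms and bonds more concrete, the following is a minimal sketch of a generic graph convolution followed by a sum readout, written with NumPy. It is not the authors' implementation: the symmetric normalization with self-loops, the two-layer depth, the toy three-atom graph (the heavy atoms of ethanol), the one-hot atom features, and the random untrained weights are all assumptions chosen for illustration, and the names `gcn_layer` and `predict_retention_time` are hypothetical.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One generic graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)                      # node degrees incl. self-loop
    d_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    propagated = d_inv_sqrt @ a_hat @ d_inv_sqrt @ features @ weights
    return np.maximum(propagated, 0.0)              # ReLU


def predict_retention_time(adjacency, features, w1, w2, w_out):
    """Two graph convolutions, a sum readout over atoms, and a linear output."""
    hidden = gcn_layer(adjacency, features, w1)
    hidden = gcn_layer(adjacency, hidden, w2)
    molecule_vector = hidden.sum(axis=0)            # order-independent readout
    return float(molecule_vector @ w_out)


# Toy molecular graph (assumption): heavy atoms of ethanol, C-C-O.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
# One-hot atom features (assumption): columns are [carbon, oxygen].
features = np.array([[1, 0],
                     [1, 0],
                     [0, 1]], dtype=float)

# Random, untrained weights; a real model would fit these to measured data.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(2, 4))
w2 = rng.normal(size=(4, 4))
w_out = rng.normal(size=4)

prediction = predict_retention_time(adjacency, features, w1, w2, w_out)
print(prediction)
```

Because each atom aggregates information only from itself and its bonded neighbors, and the readout sums over atoms, the output does not depend on the order in which atoms are listed. In a trained model, the gradient of the prediction with respect to the atom features is one common way to obtain a saliency map, although the exact saliency method used in the study is not specified here.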
Pages: 15633-15641
Page count: 9