Predicting retention time in hydrophilic interaction liquid chromatography mass spectrometry and its use for peak annotation in metabolomics

Cited by: 83
Authors
Cao, Mingshu [1]
Fraser, Karl [1]
Huege, Jan [1]
Featonby, Tom [1]
Rasmussen, Susanne [2]
Jones, Chris [1]
Affiliations
[1] AgRes Grasslands Res Ctr, Palmerston North 4442, New Zealand
[2] Massey Univ, Inst Agr & Environm, Palmerston North, New Zealand
Keywords
QSRR; LCMS; Metabolomics; Peak annotation; Metabolite identification; Lolium perenne; IDENTIFICATION; SMILES
DOI
10.1007/s11306-014-0727-x
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Liquid chromatography coupled to mass spectrometry (LCMS) is widely used in metabolomics due to its sensitivity, reproducibility, speed and versatility. Metabolites are detected as peaks which are characterised by mass-over-charge ratio (m/z) and retention time (rt), and one of the most critical but also the most challenging tasks in metabolomics is to annotate the large number of peaks detected in biological samples. Accurate m/z measurements enable the prediction of molecular formulae which provide clues to the chemical identity of peaks, but often a number of metabolites have identical molecular formulae. Chromatographic behaviour, reflecting the physicochemical properties of metabolites, should also provide structural information. However, the variation in rt between analytical runs, and the complicating factors underlying the observed time shifts, make the use of such information for peak annotation a non-trivial task. To this end, we conducted Quantitative Structure-Retention Relationship (QSRR) modelling between the calculated molecular descriptors (MDs) and the experimental retention times (rts) of 93 authentic compounds analysed using hydrophilic interaction liquid chromatography (HILIC) coupled to high resolution MS. A predictive QSRR model based on the Random Forests algorithm outperformed a Multiple Linear Regression-based model, and achieved a high correlation between predicted rts and experimental rts (Pearson's correlation coefficient = 0.97), with mean and median absolute errors of 0.52 min and 0.34 min (corresponding to 5.1 % and 3.2 % error), respectively. We demonstrate that rt prediction with the precision achieved enables the systematic utilisation of rts for annotating unknown peaks detected in a metabolomics study. The application of the QSRR model with the strategy we outlined enhanced the peak annotation process by reducing the number of false positives resulting from database queries by matching accurate mass alone, and enriching the reference library. The predicted rts were validated using either authentic compounds or ion fragmentation patterns.
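The idea of using predicted rts to reduce false positives from accurate-mass matching can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, the `filter_by_rt` function, the 1.0 min tolerance and all retention times below are hypothetical values chosen for illustration, and the predicted rts are assumed to come from some QSRR model that is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str            # hypothetical candidate label from a database query
    predicted_rt: float  # rt predicted by some QSRR model, in minutes (assumed)


def filter_by_rt(candidates, observed_rt, tolerance_min):
    """Keep candidates whose predicted rt is within +/- tolerance_min of observed_rt."""
    kept = [c for c in candidates
            if abs(c.predicted_rt - observed_rt) <= tolerance_min]
    # Rank the survivors by how close the prediction is to the observed rt.
    return sorted(kept, key=lambda c: abs(c.predicted_rt - observed_rt))


# Made-up candidates sharing one molecular formula, so m/z alone cannot separate them.
candidates = [
    Candidate("isomer_A", 4.5),
    Candidate("isomer_B", 9.0),
    Candidate("isomer_C", 5.5),
]

# The tolerance is an assumption; in practice it would be tied to the model's error.
shortlist = filter_by_rt(candidates, observed_rt=5.25, tolerance_min=1.0)
print([c.name for c in shortlist])  # ['isomer_C', 'isomer_A']
```

In this sketch the candidate predicted to elute far from the observed peak is discarded, while the remaining two are ranked by closeness and would still need confirmation, for example with authentic compounds or fragmentation patterns as the abstract describes.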
Pages: 696-706
Page count: 11