Liquid-chromatography retention order prediction for metabolite identification

Cited by: 44
Authors
Bach, Eric [1]
Szedmak, Sandor [1]
Brouard, Celine [1]
Boecker, Sebastian [2]
Rousu, Juho [1]
Affiliations
[1] Aalto Univ, HIIT, Dept Comp Sci, Espoo 00076, Finland
[2] Friedrich Schiller Univ, Chair Bioinformat, Dept Comp Sci, D-07743 Jena, Germany
Funding
Academy of Finland
Keywords
TIME PREDICTION; FRAGMENTATION; PERFORMANCE;
DOI
10.1093/bioinformatics/bty590
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Liquid Chromatography (LC) followed by tandem Mass Spectrometry (MS/MS) is one of the predominant methods for metabolite identification. In recent years, machine learning has started to transform the analysis of tandem mass spectra and the identification of small molecules. In contrast, LC data is rarely used to improve metabolite identification, despite numerous published methods for retention time prediction using machine learning.
Results: We present a machine learning method for predicting the retention order of molecules; that is, the order in which molecules elute from the LC column. Our method has important advantages over previous approaches: We show that retention order is much better conserved between instruments than retention time. To this end, our method can be trained using retention time measurements from different LC systems and configurations without tedious pre-processing, significantly increasing the amount of available training data. Our experiments demonstrate that retention order prediction is an effective way to learn retention behaviour of molecules from heterogeneous retention time data. Finally, we demonstrate how retention order prediction and MS/MS-based scores can be combined for more accurate metabolite identifications when analyzing a complete LC-MS/MS run.
Pages: 875-883
Page count: 9