Cloud Parallel Processing of Tandem Mass Spectrometry Based Proteomics Data

Cited by: 26
Authors
Mohammed, Yassene [1, 2, 3]
Mostovenko, Ekaterina [1]
Henneman, Alex A. [1]
Marissen, Rob J. [1]
Deelder, Andre M. [1]
Palmblad, Magnus [1]
Affiliations
[1] Leiden Univ, Dept Parasitol, Med Ctr, Biomol Mass Spectrometry Unit, NL-2300 RA Leiden, Netherlands
[2] Leibniz Univ Hannover, Distributed Comp Secur Grp, D-30167 Hannover, Germany
[3] Leibniz Univ Hannover, L3S, D-30167 Hannover, Germany
Keywords
proteomics; mass spectrometry; scientific workflow; data decomposition; PEPTIDE IDENTIFICATION; SPECTRA; MAPREDUCE; SEQUENCES; XTANDEM; MS/MS; ETD;
DOI
10.1021/pr300561q
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
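The data decomposition idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' decomposition or recomposition program and does not read mzXML or write pepXML: spectra are represented only by scan numbers, `mock_search` is a hypothetical stand-in for an unmodified search engine such as X!Tandem or SpectraST, and all function names are assumptions made for illustration. The sketch shows only the pattern: split the input, run the same search on each part in parallel, then merge the partial results back into scan order.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(spectra, n_parts):
    """Split a list of spectra into at most n_parts contiguous, near-equal chunks."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    size, extra = divmod(len(spectra), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks receive one additional spectrum.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few spectra
            chunks.append(spectra[start:end])
        start = end
    return chunks


def mock_search(chunk):
    """Hypothetical stand-in for a search engine run on one chunk.

    Returns one (scan_number, identification) pair per spectrum.
    """
    return [(scan, f"hit-for-{scan}") for scan in chunk]


def recompose(partial_results):
    """Merge per-chunk results into a single list ordered by scan number."""
    merged = [hit for part in partial_results for hit in part]
    return sorted(merged, key=lambda hit: hit[0])


def parallel_search(spectra, n_parts, search=mock_search):
    """Decompose, search every chunk concurrently, then recompose."""
    chunks = decompose(spectra, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial = list(pool.map(search, chunks))
    return recompose(partial)


if __name__ == "__main__":
    print(parallel_search(list(range(1, 8)), n_parts=3))
```

Because the search function is passed in as a parameter and never modified, the same wrapper applies to any engine that maps a set of spectra to a set of identifications, which is the property the abstract relies on.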
Pages: 5101 - 5108
Number of pages: 8
Related papers
50 in total
  • [21] Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides
    Heller, M
    Ye, ML
    Michel, PE
    Morier, P
    Stalder, D
    Jünger, MA
    Aebersold, R
    Reymond, FR
    Rossier, JS
    JOURNAL OF PROTEOME RESEARCH, 2005, 4 (06) : 2273 - 2282
  • [22] MzJava: An open source library for mass spectrometry data processing
    Horlacher, Oliver
    Nikitin, Frederic
    Alocci, Davide
    Mariethoz, Julien
    Mueller, Markus
    Lisacek, Frederique
    JOURNAL OF PROTEOMICS, 2015, 129 : 63 - 70
  • [23] Discovery of Protein Modifications Using Differential Tandem Mass Spectrometry Proteomics
    Cifani, Paolo
    Li, Zhi
    Luo, Danmeng
    Grivainis, Mark
    Intlekofer, Andrew M.
    Fenyo, David
    Kentsis, Alex
    JOURNAL OF PROTEOME RESEARCH, 2021, 20 (04) : 1835 - 1848
  • [24] A review of statistical methods for protein identification using tandem mass spectrometry
    Serang, Oliver
    Noble, William
    STATISTICS AND ITS INTERFACE, 2012, 5 (01) : 3 - 20
  • [25] Annotation of tandem mass spectrometry data using stochastic neural networks in shotgun proteomics
    Sulimov, Pavel
    Voronkova, Anastasia
    Kertesz-Farkas, Attila
    BIOINFORMATICS, 2020, 36 (12) : 3781 - 3787
  • [26] Mass Spectrometry Bioinformatics: Tools for Navigating the Proteomics Landscape
    Blackburn, Kevin
    Goshe, Michael B.
    CURRENT ANALYTICAL CHEMISTRY, 2009, 5 (02) : 131 - 143
  • [27] Sharing mass spectrometry data in a grid-based distributed proteomics laboratory
    Veltri, P.
    Cannataro, M.
    Tradigo, G.
    INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (03) : 577 - 591
  • [28] Peptide identification in "shotgun" proteomics using tandem mass spectrometry: Comparison of search engine algorithms
    Ivanov, M. V.
    Levitsky, L. I.
    Lobas, A. A.
    Tarasova, I. A.
    Pridatchenko, M. L.
    Zgoda, V. G.
    Moshkovskii, S. A.
    Mitulovic, G.
    Gorshkov, M. V.
    JOURNAL OF ANALYTICAL CHEMISTRY, 2015, 70 (14) : 1614 - 1619
  • [29] Data-Independent Acquisition Mass Spectrometry-Based Proteomics and Software Tools: A Glimpse in 2020
    Zhang, Fangfei
    Ge, Weigang
    Ruan, Guan
    Cai, Xue
    Guo, Tiannan
    PROTEOMICS, 2020, 20 (17-18)
  • [30] Machine Learning for Mass Spectrometry Data Analysis in Proteomics
    Li, Juntao
    Zhou, Kanglei
    Mu, Bingyu
    CURRENT PROTEOMICS, 2021, 18 (05) : 620 - 634