Cloud Parallel Processing of Tandem Mass Spectrometry Based Proteomics Data

Cited by: 26
Authors
Mohammed, Yassene [1, 2, 3]
Mostovenko, Ekaterina [1]
Henneman, Alex A. [1]
Marissen, Rob J. [1]
Deelder, Andre M. [1]
Palmblad, Magnus [1]
Affiliations
[1] Leiden Univ, Dept Parasitol, Med Ctr, Biomol Mass Spectrometry Unit, NL-2300 RA Leiden, Netherlands
[2] Leibniz Univ Hannover, Distributed Comp Secur Grp, D-30167 Hannover, Germany
[3] Leibniz Univ Hannover, L3S, D-30167 Hannover, Germany
Keywords
proteomics; mass spectrometry; scientific workflow; data decomposition; PEPTIDE IDENTIFICATION; SPECTRA; MAPREDUCE; SEQUENCES; XTANDEM; MS/MS; ETD;
DOI
10.1021/pr300561q
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
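The decompose, search, and recompose pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it works on an in-memory list of spectra rather than mzXML and pepXML files, and the names `decompose`, `search_chunk`, `recompose`, and `parallel_search` are hypothetical. `search_chunk` is only a placeholder for an unmodified search engine such as X!Tandem or SpectraST, and local threads stand in for cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(spectra, n_chunks):
    """Split spectra into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(spectra)))
    size, extra = divmod(len(spectra), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional spectrum each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(spectra[start:end])
        start = end
    return chunks


def search_chunk(chunk):
    """Hypothetical placeholder for running an unmodified search engine."""
    return [{"scan": s["scan"], "peptide": None} for s in chunk]


def recompose(results):
    """Merge per-chunk results into one list ordered by scan number."""
    merged = [hit for chunk_hits in results for hit in chunk_hits]
    return sorted(merged, key=lambda hit: hit["scan"])


def parallel_search(spectra, n_chunks, search=search_chunk):
    """Wrap parallelism around the search step without changing it."""
    chunks = decompose(spectra, n_chunks)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = list(pool.map(search, chunks))
    return recompose(results)


if __name__ == "__main__":
    spectra = [{"scan": i} for i in range(1, 11)]
    hits = parallel_search(spectra, n_chunks=3)
    print(len(hits), "results, first scan", hits[0]["scan"])
```

Because the search function is passed in as a parameter, the same wrapper applies to any engine that maps a set of spectra to a set of results, which is the property the abstract attributes to data decomposition.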
Pages: 5101 - 5108
Number of pages: 8
Related papers
50 in total
  • [41] Reinvestigating the Correctness of Decoy-Based False Discovery Rate Control in Proteomics Tandem Mass Spectrometry
    Freestone, Jack
    Noble, William Stafford
    Keich, Uri
    JOURNAL OF PROTEOME RESEARCH, 2024, 23 (06) : 1907 - 1914
  • [42] Mining Large Scale Tandem Mass Spectrometry Data for Protein Modifications Using Spectral Libraries
    Horlacher, Oliver
    Lisacek, Frederique
    Mueller, Markus
    JOURNAL OF PROTEOME RESEARCH, 2016, 15 (03) : 721 - 731
  • [43] Mass Spectrometry-Based Proteomics for the Analysis of Chromatin Structure and Dynamics
    Soldi, Monica
    Cuomo, Alessandro
    Bremang, Michael
    Bonaldi, Tiziana
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2013, 14 (03) : 5402 - 5431
  • [44] Interpretation of mass spectrometry data for high-throughput proteomics
    Daniel C. Chamrad
    Gerhard Koerting
    Johan Gobom
    Herbert Thiele
    Joachim Klose
    Helmut E. Meyer
    Martin Blueggel
    Analytical and Bioanalytical Chemistry, 2003, 376 : 1014 - 1022
  • [45] Interpretation of mass spectrometry data for high-throughput proteomics
    Chamrad, DC
    Koerting, G
    Gobom, J
    Thiele, H
    Klose, J
    Meyer, HE
    Blueggel, M
    ANALYTICAL AND BIOANALYTICAL CHEMISTRY, 2003, 376 (07) : 1014 - 1022
  • [46] Mass spectrometry-based proteomics in cancer research
    Cho, William C.
    EXPERT REVIEW OF PROTEOMICS, 2017, 14 (09) : 725 - 727
  • [47] Quality control in mass spectrometry-based proteomics
    Bittremieux, Wout
    Tabb, David L.
    Impens, Francis
    Staes, An
    Timmerman, Evy
    Martens, Lennart
    Laukens, Kris
    MASS SPECTROMETRY REVIEWS, 2018, 37 (05) : 697 - 711
  • [48] Web Resources for Mass Spectrometry-based Proteomics
    Chen, Tao
    Zhao, Jie
    Ma, Jie
    Zhu, Yunping
    GENOMICS PROTEOMICS & BIOINFORMATICS, 2015, 13 (01) : 36 - 39
  • [49] A statistical approach to peptide identification from clustered tandem mass spectrometry data
    Ryu, Soyoung
    Goodlett, David R.
    Noble, William S.
    Minin, Vladimir N.
    2012 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW), 2012,
  • [50] A bioinformatics approach for mass spectrometry data processing: Applications to proteomics and small molecule analysis
    Sonderegger, M
    Staniszewski, K
    Meyers, A
    Siuzdak, G
    SPECTROSCOPY-AN INTERNATIONAL JOURNAL, 2002, 16 (02) : 81 - 87