Cloud Parallel Processing of Tandem Mass Spectrometry Based Proteomics Data

Cited by: 26
Authors
Mohammed, Yassene [1, 2, 3]
Mostovenko, Ekaterina [1]
Henneman, Alex A. [1]
Marissen, Rob J. [1]
Deelder, Andre M. [1]
Palmblad, Magnus [1]
Affiliations
[1] Leiden Univ, Dept Parasitol, Med Ctr, Biomol Mass Spectrometry Unit, NL-2300 RA Leiden, Netherlands
[2] Leibniz Univ Hannover, Distributed Comp Secur Grp, D-30167 Hannover, Germany
[3] Leibniz Univ Hannover, L3S, D-30167 Hannover, Germany
Keywords
proteomics; mass spectrometry; scientific workflow; data decomposition; PEPTIDE IDENTIFICATION; SPECTRA; MAPREDUCE; SEQUENCES; XTANDEM; MS/MS; ETD
DOI
10.1021/pr300561q
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
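The decompose, search, recompose pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it works on an in-memory list of scan identifiers rather than mzXML and pepXML files, and `decompose`, `search_chunk`, `recompose`, and `parallel_search` are hypothetical names introduced only for illustration. The `search_chunk` function is a placeholder for invoking an unmodified search engine on one chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(spectra, n_parts):
    """Split a list of spectra into n_parts nearly equal contiguous chunks."""
    size, extra = divmod(len(spectra), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional spectrum.
        end = start + size + (1 if i < extra else 0)
        chunks.append(spectra[start:end])
        start = end
    return chunks


def search_chunk(chunk):
    """Hypothetical stand-in for running an unmodified search engine on one chunk."""
    return [(scan_id, f"hit-for-{scan_id}") for scan_id in chunk]


def recompose(partial_results):
    """Merge per-chunk results into one list ordered by scan id."""
    merged = [hit for part in partial_results for hit in part]
    return sorted(merged, key=lambda hit: hit[0])


def parallel_search(spectra, n_parts=4):
    """Decompose the input, search the chunks concurrently, recompose the results."""
    chunks = decompose(spectra, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial = list(pool.map(search_chunk, chunks))
    return recompose(partial)
```

Because the search step is treated as a black box, the same wrapper shape would apply whether the placeholder stands for a sequence database search or a spectral library search; only the splitting and merging steps need to understand the data.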
Pages: 5101-5108
Page count: 8
Related papers
50 in total
  • [1] MIC-Tandem: parallel X!Tandem Using MIC on Tandem Mass Spectrometry Based Proteomics Data
    He, Pinjie
    Li, Kenli
    2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, : 717 - 720
  • [2] A dynamic wavelet-based algorithm for pre-processing tandem mass spectrometry data
    Wang, Penghao
    Yang, Pengyi
    Arthur, Jonathan
    Yang, Jean Yee Hwa
    BIOINFORMATICS, 2010, 26 (18) : 2242 - 2249
  • [3] Grid-based Analysis of Tandem Mass Spectrometry Data in Clinical Proteomics
    Quandt, Andreas
    Hernandez, Patricia
    Kunszt, Peter
    Pautasso, Cesare
    Tuloup, Marc
    Hernandez, Celine
    Appel, Ron D.
    FROM GENES TO PERSONALIZED HEALTHCARE: GRID SOLUTIONS FOR THE LIFE SCIENCES, 2007, 126 : 13 - +
  • [4] Data Preprocessing and Filtering in Mass Spectrometry Based Proteomics
    Reiz, Beata
    Kertesz-Farkas, Attila
    Pongor, Sandor
    Myers, Michael P.
    CURRENT BIOINFORMATICS, 2012, 7 (02) : 212 - 220
  • [5] The Crux Toolkit for Analysis of Bottom-Up Tandem Mass Spectrometry Proteomics Data
    Kertesz-Farkas, Attila
    Acquaye, Frank Lawrence Nii Adoquaye
    Bhimani, Kishankumar
    Eng, Jimmy K.
    Fondrie, William E.
    Grant, Charles
    Hoopmann, Michael R.
    Lin, Andy
    Lu, Yang Y.
    Moritz, Robert L.
    MacCoss, Michael J.
    Noble, William Stafford
    JOURNAL OF PROTEOME RESEARCH, 2023, 22 (02) : 561 - 569
  • [6] Challenges in Computational Analysis of Mass Spectrometry Data for Proteomics
    Ma, Bin
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (01) : 107 - 123
  • [7] MASS SPECTROMETRY BASED PROTEOMICS
    Antohe, F.
    ACTA ENDOCRINOLOGICA-BUCHAREST, 2015, 11 (02) : 139 - 142
  • [8] Trans-Proteomic Pipeline: Robust Mass Spectrometry-Based Proteomics Data Analysis Suite
    Deutsch, Eric W.
    Mendoza, Luis
    Shteynberg, David D.
    Hoopmann, Michael R.
    Sun, Zhi
    Eng, Jimmy K.
    Moritz, Robert L.
    JOURNAL OF PROTEOME RESEARCH, 2023, : 615 - 624
  • [9] Preview: A Program for Surveying Shotgun Proteomics Tandem Mass Spectrometry Data
    Kil, Yong J.
    Becker, Christopher
    Sandoval, Wendy
    Goldberg, David
    Bern, Marshall
    ANALYTICAL CHEMISTRY, 2011, 83 (13) : 5259 - 5267
  • [10] Database Searching in Mass Spectrometry Based Proteomics
    Kertesz-Farkas, Attila
    Reiz, Beata
    Myers, Michael P.
    Pongor, Sandor
    CURRENT BIOINFORMATICS, 2012, 7 (02) : 221 - 230