Guide for protein fold change and p-value calculation for non-experts in proteomics

Cited by: 110
Authors
Aguilan, Jennifer T. [1 ,2 ]
Kulej, Katarzyna [3 ,4 ]
Sidoli, Simone [1 ,5 ]
Affiliations
[1] Albert Einstein Coll Med, Lab Macromol Anal & Prote Facil, Bronx, NY 10461 USA
[2] Albert Einstein Coll Med, Dept Pathol, Bronx, NY 10461 USA
[3] Childrens Hosp Philadelphia, Div Protect Immun, Philadelphia, PA 19104 USA
[4] Childrens Hosp Philadelphia, Div Canc Pathobiol, Philadelphia, PA 19104 USA
[5] Albert Einstein Coll Med, Dept Biochem, Bronx, NY 10461 USA
Keywords
QUANTITATIVE MASS-SPECTROMETRY; LIQUID-CHROMATOGRAPHY; STATISTICAL-ANALYSIS; LABEL-FREE; PEPTIDE IDENTIFICATION; COMPREHENSIVE ANALYSIS; ABSOLUTE PROTEIN; SOFTWARE; TANDEM; QUANTIFICATION;
DOI
10.1039/d0mo00087f
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Proteomics studies generate tables with thousands of entries. A significant component of being a proteomics scientist is the ability to process these tables to identify regulated proteins. Many bioinformatics tools are freely available to the community, some of which are within reach of scientists with limited or no background in programming and statistics. However, proteomics has become popular in most other biological and biomedical disciplines, resulting in more and more studies where data processing is delegated to specialists who are not lead authors of the scientific project. This creates a risk, or at least a limiting factor, as the biological interpretation of a dataset is contingent on a third-party specialist transforming data without the input of the project leader. We acknowledge in advance that dedicated scripts and software have a higher level of sophistication, but we claim that the approach we describe makes proteomics data processing immediately accessible to every scientist. In this paper, we describe key steps of the typical data transformation, normalization and statistics in proteomics data analysis using a simple spreadsheet. This manuscript aims to demonstrate to those who are not familiar with the math and statistics behind these workflows that a proteomics dataset can be processed, simplified and interpreted in software like Microsoft Excel. With this, we aim to reach the community of non-specialists in proteomics to find a common language and illustrate the basic steps of -omics data processing.
Pages: 573-582
Number of pages: 10