Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy

Cited by: 23
Authors
Cesnik, Anthony J. [1]
Shortreed, Michael R. [1]
Sheynkman, Gloria M. [1]
Frey, Brian L. [1]
Smith, Lloyd M. [1, 2]
Affiliations
[1] Univ Wisconsin, Dept Chem, 1101 Univ Ave, Madison, WI 53706 USA
[2] Univ Wisconsin, Genome Ctr Wisconsin, 425G Henry Mall, Madison, WI 53706 USA
Keywords
bottom-up proteomics; proteomic database search; cancer cell lines; RNA-Seq; proteogenomics; single amino acid variant (SAV); novel splice junction (NSJ); PTM; G-PTM; FALSE DISCOVERY RATES; SHOTGUN PROTEOMICS; MASS; PEPTIDES; GENOME; IDENTIFICATION; GALAXY;
DOI
10.1021/acs.jproteome.5b00817
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Mass-spectrometry-based proteomic analysis underestimates proteomic variation due to the absence of variant peptides and posttranslational modifications (PTMs) from standard protein databases. Each individual carries thousands of missense mutations that lead to single amino acid variants, but these are missed because they are absent from generic proteomic search databases. Myriad types of protein PTMs play essential roles in biological processes but remain undetected because of increased false discovery rates in variable modification searches. We address these two fundamental shortcomings of bottom-up proteomics with two recently developed software tools. The first consists of workflows in Galaxy that mine RNA sequencing data to generate sample-specific databases containing variant peptides and products of alternative splicing events. The second tool applies a new strategy that alters the variable modification approach to consider only curated PTMs at specific positions, thereby avoiding the combinatorial explosion that traditionally leads to high false discovery rates. Using RNA-sequencing-derived databases with this Global Post-Translational Modification (G-PTM) search strategy revealed hundreds of single amino acid variant peptides, tens of novel splice junction peptides, and several hundred posttranslationally modified peptides in each of ten human cell lines.
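The combinatorial argument in the abstract can be illustrated with a small sketch. It is not the authors' software: the modification table, the peptide, its start position, and the curated site list below are all hypothetical values chosen for illustration, and the count ignores the per-peptide modification caps that real search engines usually apply. It only contrasts how many candidate forms of one peptide arise when every eligible residue may be modified versus when only curated (position, modification) pairs are allowed.

```python
from itertools import product

# Hypothetical variable-modification settings: residue -> modifications allowed on it.
VARIABLE_MODS = {
    "S": ["phospho"],
    "T": ["phospho"],
    "Y": ["phospho"],
    "K": ["acetyl", "methyl"],
    "M": ["oxidation"],
}


def variable_mod_forms(peptide):
    """Enumerate every form a plain variable-modification search would consider.

    Each residue is either unmodified (None) or carries one of the
    modifications permitted for that residue type.
    """
    options = [[None] + VARIABLE_MODS.get(residue, []) for residue in peptide]
    return list(product(*options))


def curated_site_forms(peptide, peptide_start, curated_sites):
    """Enumerate forms when only curated sites may be modified.

    peptide_start is the 1-based position of the peptide's first residue in
    its parent protein; curated_sites maps a 1-based protein position to the
    modifications annotated at exactly that position.
    """
    options = []
    for offset in range(len(peptide)):
        position = peptide_start + offset
        options.append([None] + curated_sites.get(position, []))
    return list(product(*options))


# Hypothetical peptide spanning protein positions 10-15.
peptide = "MSTKYK"
curated = {11: ["phospho"], 13: ["acetyl"]}  # assumed annotations, not real data

n_variable = len(variable_mod_forms(peptide))
n_curated = len(curated_site_forms(peptide, 10, curated))
print(n_variable, n_curated)
```

Under these assumptions the position-restricted enumeration yields far fewer candidates per peptide, which is the effect the abstract credits with keeping false discovery rates manageable.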
Pages: 800-808
Page count: 9