Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq

Cited by: 18
Authors
Carlyle, Becky C. [1]
Kitchen, Robert R. [1, 2]
Zhang, Jing [2]
Wilson, Rashaun S. [3]
Lam, Tukiet T. [2, 3, 4]
Rozowsky, Joel S. [2]
Williams, Kenneth R. [2, 3]
Sestan, Nenad [5, 6, 7, 8]
Gerstein, Mark B. [2]
Nairn, Angus C. [1]
Affiliations
[1] Yale Sch Med, Dept Psychiat, Connecticut Mental Hlth Ctr, 34 Pk St, New Haven, CT 06519 USA
[2] Yale Sch Med, Dept Mol Biophys & Biochem, POB 208114, New Haven, CT 06520 USA
[3] Yale Sch Med, Yale NIDA Neuroprote Ctr, 300 George St, New Haven, CT 06510 USA
[4] Yale Sch Med, WM Keck Biotechnol Resource Lab, 300 George St, New Haven, CT 06510 USA
[5] Yale Sch Med, Dept Neurosci, New Haven, CT 06510 USA
[6] Yale Sch Med, Kavli Inst Neurosci, Comparat Med Sect, Dept Genet, New Haven, CT 06510 USA
[7] Yale Sch Med, Kavli Inst Neurosci, Comparat Med Sect, Dept Psychiat, New Haven, CT 06510 USA
[8] Yale Sch Med, Yale Child Study Ctr, Program Cellular Neurosci Neurodegenerat & Repair, New Haven, CT 06510 USA
Keywords
RNA-seq; ribosome profiling; mass spectrometry; peptides; isoforms; proteogenomics; HEK293; brain; expectation maximization; integrative analysis; TANDEM MASS-SPECTRA; SEQUENCING EXPERIMENTS; TRANSCRIPT EXPRESSION; ALZHEIMERS-DISEASE; TOP-DOWN; SPECTROMETRY; TRANSLATION; PROTEIN; QUANTIFICATION; IDENTIFICATION
DOI
10.1021/acs.jproteome.8b00310
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Cellular control of gene expression is a complex process that is subject to multiple levels of regulation, but ultimately it is the protein produced that determines the biosynthetic state of the cell. One way that a cell can regulate the protein output from each gene is by expressing alternate isoforms with distinct amino acid sequences. These isoforms may exhibit differences in localization and binding interactions that can have profound functional implications. High-throughput liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics relies on enzymatic digestion and has lower coverage and sensitivity than transcriptomic profiling methods such as RNA-seq. Digestion results in predictable fragmentation of a protein, which can limit the generation of peptides capable of distinguishing between isoforms. Here we exploit transcript-level expression from RNA-seq to set prior likelihoods and enable protein isoform abundances to be directly estimated from LC-MS/MS, an approach derived from the principle that most genes appear to be expressed as a single dominant isoform in a given cell type or tissue. Through this deep integration of RNA-seq and LC-MS/MS data from the same sample, we show that a principal isoform can be identified in >80% of gene products in homogeneous HEK293 cell culture and >70% of proteins detected in complex human brain tissue. We demonstrate that the incorporation of translatome data from ribosome profiling further refines this process. Defining isoforms in experiments with matched RNA-seq/translatome and proteomic data increases the functional relevance of such data sets and will further broaden our understanding of multilevel control of gene expression.
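
The idea of using transcript-level expression as a prior when assigning shared peptides to isoforms can be illustrated with a small expectation-maximization sketch. This is not the authors' implementation: the function name `em_isoform_abundance`, the `prior_weight` parameter, the pseudo-count form of the prior, and all input values are assumptions made up for this toy example, which covers a single gene with two hypothetical isoforms.

```python
from typing import Dict, List


def em_isoform_abundance(
    peptide_counts: Dict[str, float],
    peptide_to_isoforms: Dict[str, List[str]],
    rna_prior: Dict[str, float],
    prior_weight: float = 1.0,
    n_iter: int = 200,
) -> Dict[str, float]:
    """Toy EM: estimate relative isoform abundances from peptide evidence.

    The RNA-seq prior enters as pseudo-counts added in each M-step
    (an illustrative assumption, not the published model).
    """
    isoforms = sorted(rna_prior)
    total_prior = sum(rna_prior.values())
    # Normalize transcript expression into prior proportions.
    prior = {i: rna_prior[i] / total_prior for i in isoforms}
    theta = dict(prior)  # start the estimate at the prior

    for _ in range(n_iter):
        # Begin each round with the prior pseudo-counts.
        expected = {i: prior_weight * prior[i] for i in isoforms}
        # E-step: split each peptide's signal across compatible isoforms
        # in proportion to the current abundance estimates.
        for pep, count in peptide_counts.items():
            candidates = peptide_to_isoforms[pep]
            denom = sum(theta[i] for i in candidates)
            if denom == 0.0:
                continue
            for i in candidates:
                expected[i] += count * theta[i] / denom
        # M-step: renormalize expected counts into proportions.
        total = sum(expected.values())
        theta = {i: expected[i] / total for i in isoforms}
    return theta


# Hypothetical inputs: two shared peptides and one unique to isoform A.
peptide_counts = {"shared_1": 10.0, "shared_2": 5.0, "unique_A": 3.0}
peptide_to_isoforms = {
    "shared_1": ["A", "B"],
    "shared_2": ["A", "B"],
    "unique_A": ["A"],
}
rna_prior = {"A": 90.0, "B": 10.0}  # e.g. transcript expression values

theta = em_isoform_abundance(peptide_counts, peptide_to_isoforms, rna_prior)
print(theta)
```

In this toy case most peptides cannot distinguish the two isoforms, so the prior and the single unique peptide together push the estimate strongly toward isoform A, which mirrors the dominant-isoform principle described in the abstract.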
Pages: 3431-3444
Page count: 14