Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments

Cited by: 107
Authors
Richard, Hugues [1]
Schulz, Marcel H. [1,2]
Sultan, Marc [3]
Nuernberger, Asja [3]
Schrinner, Sabine [3]
Balzereit, Daniela [3]
Dagand, Emilie [3]
Rasche, Axel [3]
Lehrach, Hans [3]
Vingron, Martin [1]
Haas, Stefan A. [1]
Yaspo, Marie-Laure [3]
Affiliations
[1] Max Planck Inst Mol Genet, Dept Computat Mol Biol, D-14195 Berlin, Germany
[2] Int Max Planck Res Sch Computat Biol & Sci Comp, D-14195 Berlin, Germany
[3] Max Planck Inst Mol Genet, Dept Vertebrate Genom, D-14195 Berlin, Germany
Keywords
GENE-EXPRESSION; TRANSCRIPTOME; IDENTIFICATION; ALGORITHM; DISCOVERY; ARRAYS; CELLS;
DOI
10.1093/nar/gkq041
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Alternative splicing, polyadenylation of pre-messenger RNA molecules and differential promoter usage can produce a variety of transcript isoforms whose respective expression levels are regulated in time and space, thus contributing specific biological functions. However, the repertoire of mammalian alternative transcripts and their regulation are still poorly understood. Second-generation sequencing is now opening unprecedented routes to address the analysis of entire transcriptomes. Here, we developed methods that allow the prediction and quantification of alternative isoforms derived solely from exon expression levels in RNA-Seq data. These are based on an explicit statistical model and enable the prediction of alternative isoforms within or between conditions using any known gene annotation, as well as the relative quantification of known transcript structures. Applying these methods to a human RNA-Seq dataset, we validated a significant fraction of the predictions by RT-PCR. Data further showed that these predictions correlated well with information originating from junction reads. A direct comparison with exon arrays indicated improved performances of RNA-Seq over microarrays in the prediction of skipped exons. Altogether, the set of methods presented here comprehensively addresses multiple aspects of alternative isoform analysis. The software is available as an open-source R-package called Solas at http://cmb.molgen.mpg.de/2ndGenerationSequencing/Solas/.
Pages: 15
Related papers
53 in total
  • [1] *AFF, 2005, AFF WHIT PAP ALT TRA
  • [2] [Anonymous], 2013, Regression Analysis of Count Data
  • [3] SPACE: an algorithm to predict and quantify alternatively spliced isoforms using microarrays
    Anton, Miguel A.
    Gorostiaga, Dorleta
    Guruceaga, Elizabeth
    Segura, Victor
    Carmona-Saez, Pedro
    Pascual-Montano, Alberto
    Pio, Ruben
    Montuenga, Luis M.
    Rubio, Angel
    [J]. GENOME BIOLOGY, 2008, 9 (02)
  • [4] The significance of digital gene expression profiles
    Audic, S
    Claverie, JM
    [J]. GENOME RESEARCH, 1997, 7 (10) : 986 - 995
  • [5] Statistical modeling of sequencing errors in SAGE libraries
    Beissbarth, Tim
    Hyde, Lavinia
    Smyth, Gordon K.
    Job, Chris
    Boon, Wee-Ming
    Tan, Seong-Seng
    Scott, Hamish S.
    Speed, Terence P.
    [J]. BIOINFORMATICS, 2004, 20 : 31 - 39
  • [6] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [7] Alternative splicing: New insights from global analyses
    Blencowe, Benjamin J.
    [J]. CELL, 2006, 126 (01) : 37 - 47
  • [8] Alternative splicing and genome complexity
    Brett, D
    Pospisil, H
    Valcárcel, J
    Reich, J
    Bork, P
    [J]. NATURE GENETICS, 2002, 30 (01) : 29 - 30
  • [9] Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays
    Clark, TA
    Sugnet, CW
    Ares, M
    [J]. SCIENCE, 2002, 296 (5569) : 907 - 910
  • [10] Discovery of tissue-specific exons using comprehensive human exon microarrays
    Clark, Tyson A.
    Schweitzer, Anthony C.
    Chen, Tina X.
    Staples, Michelle K.
    Lu, Gang
    Wang, Hui
    Williams, Alan
    Blume, John E.
    [J]. GENOME BIOLOGY, 2007, 8 (04)