Efficient RNA isoform identification and quantification from RNA-Seq data with network flows

Cited by: 49
Authors
Bernard, Elsa [1, 2, 3]
Jacob, Laurent [4]
Mairal, Julien [5]
Vert, Jean-Philippe [1, 2, 3]
Affiliations
[1] Mines ParisTech, Ctr Computat Biol, F-77300 Fontainebleau, France
[2] Inst Curie, F-75248 Paris, France
[3] INSERM, U900, F-75248 Paris, France
[4] Univ Lyon 1, INRA, CNRS, Lab Biometrie & Biol Evolut, UMR5558, Villeurbanne, France
[5] INRIA Grenoble Rhone Alpes, LEAR Project Team, F-38330 Montbonnot St Martin, France
Funding
European Research Council; US National Science Foundation
Keywords
ABUNDANCE ESTIMATION; TRANSCRIPTOME; EXPRESSION; SELECTION; ALGORITHM; DISCOVERY; GENOME; GRAPHS; LASSO;
DOI
10.1093/bioinformatics/btu317
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Several state-of-the-art methods for isoform identification and quantification are based on ℓ1-regularized regression, such as the Lasso. However, explicitly listing the (possibly exponentially large) set of candidate transcripts is intractable for genes with many exons. For this reason, existing approaches using the ℓ1-penalty are either restricted to genes with few exons or run the regression algorithm only on a small set of preselected isoforms.
Results: We introduce a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization. Our technique removes the need for a preselection step, leading to better isoform identification while keeping a low computational cost. Experiments with synthetic and real RNA-Seq data confirm that our approach is more accurate than alternative methods and one of the fastest available.
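The scaling problem described in the abstract can be illustrated with a small sketch. It is not FlipFlop and does not use network flows: it shows the explicit approach the abstract calls intractable, where every candidate isoform is listed and an ℓ1-penalized, non-negative regression is run over all of them. Several things here are assumptions made for illustration only: any exon may be skipped independently (so a gene with n exons has 2^n - 1 candidates), the data are per-exon coverage values, the loss is squared error, and the function names `candidate_isoforms`, `fit_l1_nonneg`, and `predicted_coverage` are hypothetical.

```python
from itertools import combinations


def candidate_isoforms(n_exons):
    """All non-empty exon subsets in genomic order: 2**n_exons - 1 candidates.

    Assumption of this sketch: any exon may be skipped independently.
    """
    exons = range(n_exons)
    return [c for k in range(1, n_exons + 1) for c in combinations(exons, k)]


def fit_l1_nonneg(isoforms, coverage, lam=0.1, n_iter=5000):
    """Minimise 0.5 * ||A w - y||^2 + lam * sum(w) subject to w >= 0.

    A[e][j] is 1 when candidate isoform j contains exon e, and y is the
    observed per-exon coverage. Solved by proximal gradient descent (ISTA).
    Squared loss is an assumption chosen for simplicity.
    """
    n_exons, n_iso = len(coverage), len(isoforms)
    A = [[1.0 if e in iso else 0.0 for iso in isoforms] for e in range(n_exons)]
    # The squared Frobenius norm of A upper-bounds the gradient's Lipschitz constant.
    step = 1.0 / sum(a * a for row in A for a in row)
    w = [0.0] * n_iso
    for _ in range(n_iter):
        resid = [sum(A[e][j] * w[j] for j in range(n_iso)) - coverage[e]
                 for e in range(n_exons)]
        grad = [sum(A[e][j] * resid[e] for e in range(n_exons))
                for j in range(n_iso)]
        # Gradient step, then the prox of the non-negative l1 penalty (shift and clip).
        w = [max(0.0, w[j] - step * (grad[j] + lam)) for j in range(n_iso)]
    return w


def predicted_coverage(isoforms, w, n_exons):
    """Per-exon coverage implied by the abundance vector w."""
    return [sum(wj for iso, wj in zip(isoforms, w) if e in iso)
            for e in range(n_exons)]


if __name__ == "__main__":
    # The candidate set doubles with every added exon.
    for n in (4, 10, 20):
        print(n, "exons ->", 2 ** n - 1, "candidate isoforms")

    # Toy gene with 4 exons; coverage made up for illustration.
    iso = candidate_isoforms(4)
    y = [5.0, 3.0, 3.0, 5.0]
    w = fit_l1_nonneg(iso, y)
    for candidate, abundance in zip(iso, w):
        if abundance > 1e-3:
            print(candidate, round(abundance, 3))
```

The design matrix in this sketch has one column per candidate, so its width grows as 2^n - 1. That is why explicit approaches must either restrict themselves to small genes or preselect isoforms, and it is the step the abstract says network flow optimization avoids.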
Pages: 2447-2455
Page count: 9
Related papers
50 in total
  • [31] RNA-seq: from technology to biology
    Marguerat, Samuel
    Baehler, Juerg
    CELLULAR AND MOLECULAR LIFE SCIENCES, 2010, 67 (04) : 569 - 579
  • [32] Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation
    Trapnell, Cole
    Williams, Brian A.
    Pertea, Geo
    Mortazavi, Ali
    Kwan, Gordon
    van Baren, Marijke J.
    Salzberg, Steven L.
    Wold, Barbara J.
    Pachter, Lior
    NATURE BIOTECHNOLOGY, 2010, 28 (05) : 511 - U174
  • [34] Computational analysis of alternative polyadenylation from standard RNA-seq and single-cell RNA-seq data
    Gao, Yipeng
    Li, Wei
    MRNA 3' END PROCESSING AND METABOLISM, 2021, 655 : 225 - 243
  • [35] Automated identification of reference genes based on RNA-seq data
    Carmona, Rosario
    Arroyo, Macarena
    Jose Jimenez-Quesada, Maria
    Seoane, Pedro
    Zafra, Adoracion
    Larrosa, Rafael
    de Dios Alche, Juan
    Gonzalo Claros, M.
    BIOMEDICAL ENGINEERING ONLINE, 2017, 16
  • [36] VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis
    Cornwell, MacIntosh
    Vangala, Mahesh
    Taing, Len
    Herbert, Zachary
    Koester, Johannes
    Li, Bo
    Sun, Hanfei
    Li, Taiwen
    Zhang, Jian
    Qiu, Xintao
    Pun, Matthew
    Jeselsohn, Rinath
    Brown, Myles
    Liu, X. Shirley
    Long, Henry W.
    BMC BIOINFORMATICS, 2018, 19
  • [37] Identification of hub glycogenes and their nsSNP analysis from mouse RNA-Seq data
    Firoz, Ahmad
    Malik, Adeel
    Singh, Sanjay Kumar
    Jha, Vivekanand
    Ali, Amjad
    GENE, 2015, 574 (02) : 235 - 246
  • [38] Prostate Cancer Gene Regulatory Network inferred from RNA-Seq Data
    Moore, Daniel
    Simoes, Ricardo de Matos
    Dehmer, Matthias
    Emmert-Streib, Frank
    CURRENT GENOMICS, 2019, 20 (01) : 38 - 48
  • [39] Detecting, Categorizing, and Correcting Coverage Anomalies of RNA-Seq Quantification
    Ma, Cong
    Kingsford, Carl
    CELL SYSTEMS, 2019, 9 (06) : 589 - +
  • [40] Statistical Modeling of RNA-Seq Data
    Salzman, Julia
    Jiang, Hui
    Wong, Wing Hung
    STATISTICAL SCIENCE, 2011, 26 (01) : 62 - 83