Alternating EM algorithm for a bilinear model in isoform quantification from RNA-seq data

Cited by: 13
Authors
Deng, Wenjiang [1]
Mou, Tian [1]
Kalari, Krishna R. [2]
Niu, Nifang [3]
Wang, Liewei [3]
Pawitan, Yudi [1]
Vu, Trung Nghia [1]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, S-17177 Stockholm, Sweden
[2] Mayo Clin, Dept Hlth Sci Res, Rochester, MN 55905 USA
[3] Mayo Clin, Dept Mol Pharmacol & Expt Therapeut, Rochester, MN 55905 USA
Funding
Swedish Research Council
Keywords
EXPRESSION; ALIGNMENT; KINASE; READS;
DOI
10.1093/bioinformatics/btz640
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as uniform read distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide one or more bias-correction steps, which are based on biological considerations (such as GC content) and are applied to each sample separately. The main problem is that not all biases are known.

Results: We have developed a novel method called XAEM based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model X beta, where the design matrix X is known and is computed based on the simplifying assumptions. In contrast, XAEM treats X beta as a bilinear model in which both X and beta are unknown. Joint estimation of X and beta is made possible by a simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. We use an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and beta. For speed, XAEM uses quasi-mapping for read alignment. Overall, XAEM performs favorably compared to recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes. In a differential-expression analysis of a real single-cell RNA-seq dataset, XAEM achieves substantially better rediscovery rates in independent validation sets.
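
The alternating idea described in the abstract can be illustrated with a toy sketch. This is not the XAEM implementation: it assumes a plain Poisson model Y ~ Poisson(X B) for a single gene, where Y holds read counts per read class and sample, X is the unknown design matrix, and B stacks the per-sample beta vectors. The function name `alternating_em`, the normalization of X columns to sum to 1 (used here to fix the scale ambiguity of a bilinear model), and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def alternating_em(Y, X0, n_iter=200, eps=1e-12):
    """Toy alternating EM-style updates for Y ~ Poisson(X @ B).

    Y  : (n_classes, n_samples) read counts (hypothetical input layout).
    X0 : (n_classes, n_isoforms) starting design matrix; zeros mark
         class/isoform pairs that are incompatible and stay zero.
    Returns X (columns sum to 1) and B (n_isoforms, n_samples).
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X0, dtype=float).copy()
    X /= X.sum(axis=0, keepdims=True)          # fix the scale of each column
    n_iso = X.shape[1]
    # Start by spreading each sample's total count evenly over isoforms.
    B = np.tile(Y.sum(axis=0) / n_iso, (n_iso, 1))
    for _ in range(n_iter):
        # Step 1: update B with X held fixed (standard Poisson EM update;
        # the usual denominator is 1 because columns of X sum to 1).
        R = Y / (X @ B + eps)
        B *= X.T @ R
        # Step 2: update X with B held fixed, then renormalize columns.
        R = Y / (X @ B + eps)
        X *= R @ B.T
        X /= X.sum(axis=0, keepdims=True) + eps
    return X, B

# Simulated example: 3 read classes, 2 isoforms, 8 samples (made-up values).
rng = np.random.default_rng(0)
X_true = np.array([[0.6, 0.0],
                   [0.4, 0.3],
                   [0.0, 0.7]])
B_true = rng.uniform(10, 100, size=(2, 8))
Y = rng.poisson(X_true @ B_true)

X0 = (X_true > 0).astype(float)                # only the zero pattern is assumed known
X_hat, B_hat = alternating_em(Y, X0)
```

In this sketch, several samples are needed because a single column of Y cannot separate X from beta; sharing X across samples is what makes the joint estimate possible, in the spirit of the multi-sample analysis the abstract describes.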
Pages: 805-812
Page count: 8