MIXnorm: normalizing RNA-seq data from formalin-fixed paraffin-embedded samples

Cited by: 3
Authors
Yin, Shen [1, 2]
Wang, Xinlei [1]
Jia, Gaoxiang [1]
Xie, Yang [2]
Affiliations
[1] Southern Methodist Univ, Dept Stat Sci, Dallas, TX 75275 USA
[2] Univ Texas Southwestern Med Ctr Dallas, Quantitat Biomed Res Ctr, Dept Populat & Data Sci, Dallas, TX 75390 USA
Funding
National Institutes of Health (USA)
Keywords
DOPAMINE TRANSPORTER; CANCER; IDENTIFICATION; SLC6A3
DOI
10.1093/bioinformatics/btaa153
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Recent studies have shown that RNA-sequencing (RNA-seq) can be used to measure mRNA of sufficient quality extracted from formalin-fixed paraffin-embedded (FFPE) tissues to provide whole-genome transcriptome analysis. However, little attention has been given to the normalization of FFPE RNA-seq data, a key step that adjusts for unwanted biological and technical effects that can bias the signal of interest. Existing methods, developed based on fresh-frozen or similar-type samples, may cause suboptimal performance.
Results: We proposed a new normalization method, labeled MIXnorm, for FFPE RNA-seq data. MIXnorm relies on a two-component mixture model, which models non-expressed genes by zero-inflated Poisson distributions and models expressed genes by truncated normal distributions. To obtain maximum likelihood estimates, we developed a nested EM algorithm, in which closed-form updates are available in each iteration. By eliminating the need for numerical optimization in the M-step, the algorithm is easy to implement and computationally efficient. We evaluated MIXnorm through simulations and cancer studies. MIXnorm makes a significant improvement over commonly used methods for RNA-seq expression data.
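The abstract's remark about closed-form EM updates can be illustrated on a much smaller problem. The sketch below is not the MIXnorm nested EM algorithm and does not include the truncated normal component or the gene-level mixture; it only fits a single zero-inflated Poisson (ZIP) distribution to a vector of counts. The function names `simulate_zip` and `fit_zip_em`, the starting values, and the stopping rule are all assumptions made for illustration.

```python
import math
import random


def simulate_zip(n, pi, lam, seed=0):
    """Draw n counts from a ZIP: structural zero w.p. pi, else Poisson(lam)."""
    rng = random.Random(seed)
    threshold = math.exp(-lam)
    counts = []
    for _ in range(n):
        if rng.random() < pi:
            counts.append(0)
            continue
        # Knuth's multiplication sampler for Poisson(lam); fine for small lam.
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        counts.append(k)
    return counts


def fit_zip_em(counts, n_iter=500, tol=1e-10):
    """Fit a ZIP by EM and return (pi, lam).

    pi  : estimated probability of a structural zero
    lam : estimated mean of the Poisson component
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    if total == 0:
        raise ValueError("all counts are zero; lam is not identifiable")

    # Crude starting values (an arbitrary choice for this sketch).
    pi = 0.5 * n_zero / n
    lam = total / (n - n_zero)

    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        # Positive counts cannot be structural zeros, so their weight is 0.
        tau = pi / (pi + (1.0 - pi) * math.exp(-lam))

        # M-step: both updates are closed-form, no numerical optimizer needed.
        expected_structural = n_zero * tau
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)

        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


if __name__ == "__main__":
    data = simulate_zip(5000, pi=0.3, lam=4.0, seed=1)
    print(fit_zip_em(data))
```

In this toy setting the E-step reduces to one posterior probability shared by all observed zeros, and the M-step is a pair of weighted averages, which is the kind of update the abstract describes as avoiding numerical optimization.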
Pages: 3401-3408
Page count: 8
Related papers
27 in total
[1]   Differential expression analysis for sequence count data [J].
Anders, Simon ;
Huber, Wolfgang .
GENOME BIOLOGY, 2010, 11 (10)
[2]   Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments [J].
Bullard, James H. ;
Purdom, Elizabeth ;
Hansen, Kasper D. ;
Dudoit, Sandrine .
BMC BIOINFORMATICS, 2010, 11
[3]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[4]   A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis [J].
Dillies, Marie-Agnes ;
Rau, Andrea ;
Aubert, Julie ;
Hennequet-Antier, Christelle ;
Jeanmougin, Marine ;
Servant, Nicolas ;
Keime, Celine ;
Marot, Guillemette ;
Castel, David ;
Estelle, Jordi ;
Guernec, Gregory ;
Jagla, Bernd ;
Jouneau, Luc ;
Laloe, Denis ;
Le Gall, Caroline ;
Schaeffer, Brigitte ;
Le Crom, Stephane ;
Guedj, Mickael ;
Jaffrezic, Florence .
BRIEFINGS IN BIOINFORMATICS, 2013, 14 (06) :671-683
[5]   Transcriptome Sequencing (RNAseq) Enables Utilization of Formalin-Fixed, Paraffin-Embedded Biopsies with Clear Cell Renal Cell Carcinoma for Exploration of Disease Biology and Biomarker Development [J].
Eikrem, Oystein ;
Beisland, Christian ;
Hjelle, Karin ;
Flatberg, Arnar ;
Scherer, Andreas ;
Landolt, Lea ;
Skogstrand, Trude ;
Leh, Sabine ;
Beisvag, Vidar ;
Marti, Hans-Peter .
PLOS ONE, 2016, 11 (02)
[6]   Robust gene expression and mutation analyses of RNA-sequencing of formalin-fixed diagnostic tumor samples [J].
Graw, Stefan ;
Meier, Richard ;
Minn, Kay ;
Bloomer, Clark ;
Godwin, Andrew K. ;
Fridley, Brooke ;
Vlad, Anda ;
Beyerlein, Peter ;
Chien, Jeremy .
SCIENTIFIC REPORTS, 2015, 5
[7]   RNA-seq transcriptome analysis of formalin fixed, paraffin-embedded canine meningioma [J].
Grenier, Jennifer K. ;
Foureman, Polly A. ;
Sloma, Erica A. ;
Miller, Andrew D. .
PLOS ONE, 2017, 12 (10)
[8]   Overexpression of Functional SLC6A3 in Clear Cell Renal Cell Carcinoma [J].
Hansson, Jennifer ;
Lindgren, David ;
Nilsson, Helen ;
Johansson, Elinn ;
Johansson, Martin ;
Gustavsson, Lena ;
Axelson, Hakan .
CLINICAL CANCER RESEARCH, 2017, 23 (08) :2105-2115
[9]   ZERO-INFLATED POISSON REGRESSION, WITH AN APPLICATION TO DEFECTS IN MANUFACTURING [J].
LAMBERT, D .
TECHNOMETRICS, 1992, 34 (01) :1-14
[10]   RNA sequencing validation of the Complexity INdex in SARComas prognostic signature [J].
Lesluyes, Tom ;
Perot, Gaelle ;
Largeau, Marine Roxane ;
Brulard, Celine ;
Lagarde, Pauline ;
Dapremont, Valerie ;
Lucchesi, Carlo ;
Neuville, Agnes ;
Terrier, Philippe ;
Vince-Ranchere, Dominique ;
Mendez-Lago, Maria ;
Gut, Marta ;
Gut, Ivo ;
Coindre, Jean-Michel ;
Chibon, Frederic .
EUROPEAN JOURNAL OF CANCER, 2016, 57 :104-111