variancePartition: interpreting drivers of variation in complex gene expression studies

Times cited: 427
Authors
Hoffman, Gabriel E. [1]
Schadt, Eric E. [1]
Affiliations
[1] Icahn Sch Med Mt Sinai, Icahn Inst Genom & Multiscale Biol, Dept Genet & Genom Sci, New York, NY 10029 USA
Keywords
Transcriptome profiling; RNA-seq; Linear mixed model; GENOME-WIDE; MIXED-MODEL; QUANTIFICATION; TRANSCRIPTOME; VARIANCE; READS;
DOI
10.1186/s12859-016-1323-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: As large-scale studies of gene expression with multiple sources of biological and technical variation become widely adopted, characterizing these drivers of variation becomes essential to understanding disease biology and regulatory genetics. Results: We describe a statistical and visualization framework, variancePartition, to prioritize drivers of variation based on a genome-wide summary, and identify genes that deviate from the genome-wide trend. Using a linear mixed model, variancePartition quantifies variation in each expression trait attributable to differences in disease status, sex, cell or tissue type, ancestry, genetic background, experimental stimulus, or technical variables. Analysis of four large-scale transcriptome profiling datasets illustrates that variancePartition recovers striking patterns of biological and technical variation that are reproducible across multiple datasets. Conclusions: Our open source software, variancePartition, enables rapid interpretation of complex gene expression studies as well as other high-throughput genomics assays. variancePartition is available from Bioconductor: http://bioconductor.org/packages/variancePartition.
Pages: 13
Related papers
52 items in total
[1]   Singular value decomposition for genome-wide expression data processing and modeling [J].
Alter, O ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (18) :10101-10106
[2]   HTSeq-a Python framework to work with high-throughput sequencing data [J].
Anders, Simon ;
Pyl, Paul Theodor ;
Huber, Wolfgang .
BIOINFORMATICS, 2015, 31 (02) :166-169
[3]   [Anonymous], 2015, REVOLUTION ANAL WEST
[4]   The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans [J].
Ardlie, Kristin G. ;
DeLuca, David S. ;
Segre, Ayellet V. ;
Sullivan, Timothy J. ;
Young, Taylor R. ;
Gelfand, Ellen T. ;
Trowbridge, Casandra A. ;
Maller, Julian B. ;
Tukiainen, Taru ;
Lek, Monkol ;
Ward, Lucas D. ;
Kheradpour, Pouya ;
Iriarte, Benjamin ;
Meng, Yan ;
Palmer, Cameron D. ;
Esko, Tonu ;
Winckler, Wendy ;
Hirschhorn, Joel N. ;
Kellis, Manolis ;
MacArthur, Daniel G. ;
Getz, Gad ;
Shabalin, Andrey A. ;
Li, Gen ;
Zhou, Yi-Hui ;
Nobel, Andrew B. ;
Rusyn, Ivan ;
Wright, Fred A. ;
Lappalainen, Tuuli ;
Ferreira, Pedro G. ;
Ongen, Halit ;
Rivas, Manuel A. ;
Battle, Alexis ;
Mostafavi, Sara ;
Monlong, Jean ;
Sammeth, Michael ;
Mele, Marta ;
Reverter, Ferran ;
Goldmann, Jakob M. ;
Koller, Daphne ;
Guigo, Roderic ;
McCarthy, Mark I. ;
Dermitzakis, Emmanouil T. ;
Gamazon, Eric R. ;
Im, Hae Kyung ;
Konkashbaev, Anuar ;
Nicolae, Dan L. ;
Cox, Nancy J. ;
Flutre, Timothee ;
Wen, Xiaoquan ;
Stephens, Matthew .
SCIENCE, 2015, 348 (6235) :648-660
[5]   Fitting Linear Mixed-Effects Models Using lme4 [J].
Bates, Douglas ;
Maechler, Martin ;
Bolker, Benjamin M. ;
Walker, Steven C. .
JOURNAL OF STATISTICAL SOFTWARE, 2015, 67 (01) :1-48
[6]   Near-optimal probabilistic RNA-seq quantification (vol 34, pg 525, 2016) [J].
Bray, Nicolas L. ;
Pimentel, Harold ;
Melsted, Pall ;
Pachter, Lior .
NATURE BIOTECHNOLOGY, 2016, 34 (08) :888-888
[7]   Genetic Variation, Not Cell Type of Origin, Underlies the Majority of Identifiable Regulatory Differences in iPSCs [J].
Burrows, Courtney K. ;
Banovich, Nicholas E. ;
Pavlovic, Bryan J. ;
Patterson, Kristen ;
Romero, Irene Gallego ;
Pritchard, Jonathan K. ;
Gilad, Yoav .
PLOS GENETICS, 2016, 12 (01)
[8]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[9]   Innate Immune Activity Conditions the Effect of Regulatory Variants upon Monocyte Gene Expression [J].
Fairfax, Benjamin P. ;
Humburg, Peter ;
Makino, Seiko ;
Naranbhai, Vivek ;
Wong, Daniel ;
Lau, Evelyn ;
Jostins, Luke ;
Plant, Katharine ;
Andrews, Robert ;
Mcgee, Chris ;
Knight, Julian C. .
SCIENCE, 2014, 343 (6175) :1118-+
[10]   mRIN for direct assessment of genome-wide and gene-specific mRNA integrity from large-scale RNA-sequencing data [J].
Feng, Huijuan ;
Zhang, Xuegong ;
Zhang, Chaolin .
NATURE COMMUNICATIONS, 2015, 6