Kimma: flexible linear mixed effects modeling with kinship covariance for RNA-seq data

Cited by: 14
Authors
Dill-McFarland, Kimberly A. [1]
Mitchell, Kiana [1, 2]
Batchu, Sashank [1]
Segnitz, Richard Max [1]
Benson, Basilin [3]
Janczyk, Tomasz [3]
Cox, Madison S. [1]
Mayanja-Kizza, Harriet [4]
Boom, William Henry [5]
Benchek, Penelope [6]
Stein, Catherine M. [6]
Hawn, Thomas R. [1]
Altman, Matthew C. [1, 3, 7]
Affiliations
[1] Univ Washington, Dept Med, Div Allergy & Infect Dis, 750 Republican St, Seattle, WA 98109 USA
[2] Univ Calif San Diego, Dept Biol, 9500 Gilman Dr, La Jolla, CA 92093 USA
[3] Benaroya Res Inst, Syst Immunol Div, 1201 Ninth Ave, Seattle, WA 98101 USA
[4] Makerere Univ, Sch Med, Dept Med, POB 7072, Kampala, Uganda
[5] Case Western Reserve Univ, Dept Med, 10900 Euclid Ave, Cleveland, OH 44106 USA
[6] Case Western Reserve Univ, Dept Populat & Quantitat Hlth Sci, 10900 Euclid Ave, Cleveland, OH 44106 USA
[7] Benaroya Res Inst, Syst Immunol Program, Seattle, WA 98101 USA
Funding
US National Institutes of Health
Keywords
DOI
10.1093/bioinformatics/btad279
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: The identification of differentially expressed genes (DEGs) from transcriptomic datasets is a major avenue of research across diverse disciplines. However, current bioinformatic tools do not support covariance matrices in DEG modeling. Here, we introduce kimma (Kinship In Mixed Model Analysis), an open-source R package for flexible linear mixed effects modeling including covariates, weights, random effects, covariance matrices, and fit metrics.
Results: In simulated datasets, kimma detects DEGs with similar specificity, sensitivity, and computational time as limma unpaired and dream paired models. Unlike other software, kimma supports covariance matrices as well as fit metrics like Akaike information criterion (AIC). Utilizing genetic kinship covariance, kimma revealed that kinship impacts model fit and DEG detection in a related cohort. Thus, kimma equals or outcompetes current DEG pipelines in sensitivity, computational time, and model complexity.
Availability and implementation: Kimma is freely available on GitHub with an instructional vignette at .
Pages: 7
Related papers
27 in total
[1]   Fitting Linear Mixed-Effects Models Using lme4 [J].
Bates, Douglas ;
Maechler, Martin ;
Bolker, Benjamin M. ;
Walker, Steven C. .
JOURNAL OF STATISTICAL SOFTWARE, 2015, 67 (01) :1-48
[2]   Assessment of kinship detection using RNA-seq data [J].
Blay, Natalia ;
Casas, Eduard ;
Galvan-Femenia, Ivan ;
Graffelman, Jan ;
de Cid, Rafael ;
Vavouri, Tanya .
NUCLEIC ACIDS RESEARCH, 2019, 47 (21) :E136
[3]   Comparing Large Covariance Matrices under Weak Conditions on the Dependence Structure and Its Application to Gene Clustering [J].
Chang, Jinyuan ;
Zhou, Wen ;
Zhou, Wen-Xin ;
Wang, Lan .
BIOMETRICS, 2017, 73 (01) :31-41
[4]   Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies [J].
Chen, Han ;
Huffman, Jennifer E. ;
Brody, Jennifer A. ;
Wang, Chaolong ;
Lee, Seunggeun ;
Li, Zilin ;
Gogarten, Stephanie M. ;
Sofer, Tamar ;
Bielak, Lawrence F. ;
Bis, Joshua C. ;
Blangero, John ;
Bowler, Russell P. ;
Cade, Brian E. ;
Cho, Michael H. ;
Correa, Adolfo ;
Curran, Joanne E. ;
de Vries, Paul S. ;
Glahn, David C. ;
Guo, Xiuqing ;
Johnson, Andrew D. ;
Kardia, Sharon ;
Kooperberg, Charles ;
Lewis, Joshua P. ;
Liu, Xiaoming ;
Mathias, Rasika A. ;
Mitchell, Braxton D. ;
O'Connell, Jeffrey R. ;
Peyser, Patricia A. ;
Post, Wendy S. ;
Reiner, Alex P. ;
Rich, Stephen S. ;
Rotter, Jerome I. ;
Silverman, Edwin K. ;
Smith, Jennifer A. ;
Vasan, Ramachandran S. ;
Wilson, James G. ;
Yanek, Lisa R. ;
Redline, Susan ;
Smith, Nicholas L. ;
Boerwinkle, Eric ;
Borecki, Ingrid B. ;
Cupples, L. Adrienne ;
Laurie, Cathy C. ;
Morrison, Alanna C. ;
Rice, Kenneth M. ;
Lin, Xihong .
AMERICAN JOURNAL OF HUMAN GENETICS, 2019, 104 (02) :260-274
[5]   The human genome project: Lessons from large-scale biology [J].
Collins, FS ;
Morgan, M ;
Patrinos, A .
SCIENCE, 2003, 300 (5617) :286-290
[6]   A survey of best practices for RNA-seq data analysis [J].
Conesa, Ana ;
Madrigal, Pedro ;
Tarazona, Sonia ;
Gomez-Cabrero, David ;
Cervera, Alejandra ;
McPherson, Andrew ;
Szczesniak, Michal Wojciech ;
Gaffney, Daniel J. ;
Elo, Laura L. ;
Zhang, Xuegong ;
Mortazavi, Ali .
GENOME BIOLOGY, 2016, 17
[7]   STAR: ultrafast universal RNA-seq aligner [J].
Dobin, Alexander ;
Davis, Carrie A. ;
Schlesinger, Felix ;
Drenkow, Jorg ;
Zaleski, Chris ;
Jha, Sonali ;
Batut, Philippe ;
Chaisson, Mark ;
Gingeras, Thomas R. .
BIOINFORMATICS, 2013, 29 (01) :15-21
[8]   dearseq: a variance component score test for RNA-seq differential analysis that effectively controls the false discovery rate [J].
Gauthier, Marine ;
Agniel, Denis ;
Thiebaut, Rodolphe ;
Hejblum, Boris P. .
NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (04)
[9]   Genetic association testing using the GENESIS R/Bioconductor package [J].
Gogarten, Stephanie M. ;
Sofer, Tamar ;
Chen, Han ;
Yu, Chaoyu ;
Brody, Jennifer A. ;
Thornton, Timothy A. ;
Rice, Kenneth M. ;
Conomos, Matthew P. .
BIOINFORMATICS, 2019, 35 (24) :5346-5348
[10]   Dream: powerful differential expression analysis for repeated measures designs [J].
Hoffman, Gabriel E. ;
Roussos, Panos .
BIOINFORMATICS, 2021, 37 (02) :192-201