MaSk-LMM: A Matrix Sketching Framework for Linear Mixed Models in Association Studies

Authors
Burch, Myson [1]
Bose, Aritra [1]
Dexter, Gregory [2]
Parida, Laxmi [1]
Drineas, Petros [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Source
RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, RECOMB 2024 | 2024 / Vol. 14758
Keywords
Linear Mixed Models; Matrix Sketching; GWAS;
DOI
10.1007/978-1-0716-3989-4_29
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Linear mixed models (LMMs) have been widely used in genome-wide association studies (GWAS) to control for population stratification and cryptic relatedness. Unfortunately, estimating LMM parameters is computationally expensive, requiring large-scale matrix operations to build the genetic relatedness matrix. Randomized linear algebra offers alternative approaches to such matrix operations by leveraging matrix sketching, which often yields provably accurate, fast, and efficient approximations. We leverage matrix sketching to develop a fast and efficient LMM method called Matrix-Sketching LMM (MaSk-LMM), which sketches the genotype matrix to reduce its dimensions and speed up computations. Our framework provides theoretical guarantees and strong empirical performance compared to current methods.
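The idea of sketching a genotype matrix before forming a genetic relatedness matrix can be illustrated with a small example. The abstract does not say which sketching transform MaSk-LMM uses, so the following is only a minimal sketch under stated assumptions: a Gaussian random projection applied to the marker dimension, simulated binomial genotypes, and hypothetical helper names (`sketch_genotypes`, `relatedness`). It is not the authors' implementation.

```python
import numpy as np

def sketch_genotypes(X, sketch_size, rng):
    """Compress the marker dimension of X (samples x markers).

    Assumption: a dense Gaussian sketch scaled by 1/sqrt(sketch_size),
    so that S @ S.T equals the identity in expectation.
    """
    n_markers = X.shape[1]
    S = rng.standard_normal((n_markers, sketch_size)) / np.sqrt(sketch_size)
    return X @ S

def relatedness(X, n_markers):
    """Relatedness-style matrix X X^T / m for a standardized matrix X."""
    return (X @ X.T) / n_markers

rng = np.random.default_rng(0)
n_samples, n_markers, sketch_size = 20, 5000, 1000

# Simulated genotypes (0/1/2 allele counts); purely illustrative data.
G = rng.binomial(2, 0.3, size=(n_samples, n_markers)).astype(float)
G = G[:, G.std(axis=0) > 0]            # drop monomorphic markers
m = G.shape[1]
X = (G - G.mean(axis=0)) / G.std(axis=0)  # standardize each marker

K_exact = relatedness(X, m)
K_sketch = relatedness(sketch_genotypes(X, sketch_size, rng), m)

# Relative Frobenius-norm error of the sketched matrix.
rel_err = np.linalg.norm(K_exact - K_sketch) / np.linalg.norm(K_exact)
print(f"relative error: {rel_err:.3f}")
```

The sketched product multiplies a samples-by-`sketch_size` matrix instead of a samples-by-markers one, which is where the savings would come from when the number of markers is much larger than the sketch size. The approximation error shrinks as `sketch_size` grows.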
Pages: 352-355
Page count: 4