locStra: Fast analysis of regional/global stratification in whole-genome sequencing studies

Cited by: 6
Authors
Hahn, Georg [1]
Lutz, Sharon M. [1]
Hecker, Julian [2]
Prokopenko, Dmitry [3]
Cho, Michael H. [2]
Silverman, Edwin K. [2]
Weiss, Scott T. [2]
Lange, Christoph [1]
Affiliations
[1] Harvard Univ, TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[2] Harvard Univ, Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[3] Harvard Univ, Massachusetts Gen Hosp, Boston, MA USA
Keywords
regional analysis; population stratification; population substructure; similarity matrix; whole-genome sequencing; GENETIC ASSOCIATION ANALYSIS; LOCAL-ANCESTRY; RARE VARIANTS; INFERENCE; LINKAGE
DOI
10.1002/gepi.22356
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
locStra is an R package for the analysis of regional and global population stratification in whole-genome sequencing (WGS) studies, where regional stratification refers to the substructure defined by the loci in a particular region of the genome. Population substructure can be assessed based on the genetic covariance matrix, the genomic relationship matrix, and the unweighted/weighted genetic Jaccard similarity matrix. Using a sliding window approach, the regional similarity matrices are compared with the global ones, based on user-defined window sizes and metrics, for example, the correlation between regional and global eigenvectors. An algorithm for the specification of the window size is provided. As the implementation fully exploits sparse matrix algebra and is written in C++, the analysis is highly efficient. Even on single cores, for realistic study sizes (several thousand subjects, several million rare variants per subject), the runtime for the genome-wide computation of all regional similarity matrices typically does not exceed one hour, enabling an unprecedented investigation of regional stratification across the entire genome. The package is applied to three WGS studies, illustrating the varying patterns of regional substructure across the genome and its beneficial effects on association testing.
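The sliding-window comparison described in the abstract can be illustrated with a minimal sketch. This is not the locStra API or its C++ implementation: the function names (`jaccard_matrix`, `leading_eigenvector`, `pearson`, `regional_vs_global`) are hypothetical, the windows are assumed to be non-overlapping, only the unweighted Jaccard matrix is used, the leading eigenvector is approximated by plain power iteration on dense lists rather than sparse matrix algebra, and the similarity of two subjects without any minor alleles is set to 0 by convention.

```python
from math import sqrt


def jaccard_matrix(genotypes):
    """Unweighted Jaccard similarity between subjects.

    genotypes[i][j] is 1 if subject i carries the minor allele at variant j.
    Assumption: two subjects with no minor alleles get similarity 0.
    """
    carriers = [{j for j, g in enumerate(row) if g} for row in genotypes]
    n = len(carriers)
    sim = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            union = carriers[a] | carriers[b]
            if union:
                sim[a][b] = len(carriers[a] & carriers[b]) / len(union)
    return sim


def leading_eigenvector(matrix, iterations=200):
    """Approximate the first eigenvector by power iteration."""
    n = len(matrix)
    vec = [1.0 + i / n for i in range(n)]  # deterministic, positive start
    for _ in range(iterations):
        nxt = [sum(matrix[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # all-zero matrix: keep the current vector
            return vec
        vec = [x / norm for x in nxt]
    return vec


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / sqrt(vx * vy)


def regional_vs_global(genotypes, window_size):
    """Correlate each window's first eigenvector with the global one.

    Returns a list of (window_start, correlation) pairs.
    """
    n_variants = len(genotypes[0])
    global_vec = leading_eigenvector(jaccard_matrix(genotypes))
    results = []
    for start in range(0, n_variants, window_size):
        window = [row[start:start + window_size] for row in genotypes]
        regional_vec = leading_eigenvector(jaccard_matrix(window))
        results.append((start, pearson(regional_vec, global_vec)))
    return results
```

Calling `regional_vs_global(genotypes, window_size)` on a 0/1 matrix with one row per subject yields one correlation per window. Under the assumptions of this sketch, windows whose correlation with the global eigenvector is low would be the regions where local substructure departs from the genome-wide pattern.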
Pages: 82-98
Number of pages: 17