Genome-wide analysis of common and rare variants via multiple knockoffs at biobank scale, with an application to Alzheimer disease genetics

Cited by: 12
Authors
He, Zihuai [1 ,2 ]
Guen, Yann Le [1 ,3 ]
Liu, Linxi [4 ]
Lee, Justin [2 ]
Ma, Shiyang [5 ]
Yang, Andrew C. [1 ]
Liu, Xiaoxia [1 ]
Rutledge, Jarod [6 ]
Losada, Patricia Moran [1 ]
Song, Bowen [7 ]
Belloy, Michael E. [1 ]
Butler, Robert R., III [1 ]
Longo, Frank M. [1 ]
Tang, Hua [6 ]
Mormino, Elizabeth C. [1 ]
Wyss-Coray, Tony [1 ]
Greicius, Michael D. [1 ]
Ionita-Laza, Iuliana [5 ]
Affiliations
[1] Stanford Univ, Dept Neurol & Neurol Sci, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Med, Quantitat Sci Unit, Stanford, CA 94305 USA
[3] Inst Cerveau Paris Brain Inst ICM, F-75013 Paris, France
[4] Univ Pittsburgh, Dept Stat, Pittsburgh, PA 15260 USA
[5] Columbia Univ, Dept Biostat, New York, NY 10032 USA
[6] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[7] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
Funding
UK Medical Research Council; Wellcome Trust;
Keywords
EXPRESSION; CELLS; MODEL; LOCI;
DOI
10.1016/j.ajhg.2021.10.009
Chinese Library Classification (CLC) number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
Knockoff-based methods have become increasingly popular due to their enhanced power for locus discovery and their ability to prioritize putative causal variants in a genome-wide analysis. However, because of the substantial computational cost for generating knockoffs, existing knockoff approaches cannot analyze millions of rare genetic variants in biobank-scale whole-genome sequencing and whole-genome imputed datasets. We propose a scalable knockoff-based method for the analysis of common and rare variants across the genome, KnockoffScreen-AL, that is applicable to biobank-scale studies with hundreds of thousands of samples and millions of genetic variants. The application of KnockoffScreen-AL to the analysis of Alzheimer disease (AD) in 388,051 WG-imputed samples from the UK Biobank resulted in 31 significant loci, including 14 loci that are missed by conventional association tests on these data. We perform replication studies in an independent meta-analysis of clinically diagnosed AD with 94,437 samples, and additionally leverage single-cell RNA-sequencing data with 143,793 single-nucleus transcriptomes from 17 control subjects and AD-affected individuals, and proteomics data from 735 control subjects and affected individuals with AD and related disorders to validate the genes at these significant loci. These multi-omics analyses show that 79.1% of the proximal genes at these loci and 76.2% of the genes at loci identified only by KnockoffScreen-AL exhibit at least suggestive signal (p < 0.05) in the scRNA-seq or proteomics analyses. We highlight a potentially causal gene in AD progression, EGFR, that shows significant differences in expression and protein levels between AD-affected individuals and healthy control subjects.
Pages: 2336-2353
Number of pages: 18