Scalable and Robust Regression Methods for Phenome-Wide Association Analysis on Large-Scale Biobank Data

Times cited: 5
Authors
Bi, Wenjian [1, 2, 3]
Lee, Seunggeun [4]
Affiliations
[1] Peking Univ, Sch Basic Med Sci, Dept Med Genet, Beijing, Peoples R China
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[4] Seoul Natl Univ, Grad Sch Data Sci, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
phenome-wide association studies; electronic health records-EHR; saddlepoint approximation; biobank data analysis; unbalanced phenotypic distribution; genetic relatedness; mixed model approaches; GENE-ENVIRONMENT INTERACTION; MIXED-MODEL ANALYSIS; RARE VARIANTS; POPULATION-STRUCTURE; COMMON DISEASES; SEQUENCING DATA; TESTS; METAANALYSIS; SURVIVAL; POWER;
DOI
10.3389/fgene.2021.682638
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification codes
071007; 090102;
Abstract
With advances in genotyping technologies and electronic health records (EHRs), large biobanks have become valuable resources for identifying novel genetic associations and gene-environment interactions on a genome-wide and even a phenome-wide scale. To date, several phenome-wide association studies (PheWAS) have been performed on biobank data, providing comprehensive insights into many aspects of human genetics and biology. Although inspiring, PheWAS on large-scale biobank data encounters new challenges, including computational burden, unbalanced phenotypic distributions, and genetic relatedness among samples. In this paper, we first discuss these challenges and their potential impact on data analysis. Then, we summarize approaches that are scalable and robust in GWAS and PheWAS. This review can serve as a practical guide for geneticists, epidemiologists, and other medical researchers seeking to identify genetic variants associated with health-related phenotypes in large-scale biobank data. It can also help statisticians gain a comprehensive and up-to-date understanding of the current development of technical tools.
Pages: 12