Participation bias in the UK Biobank distorts genetic associations and downstream analyses

Times cited: 173
Authors
Schoeler, Tabea [1, 2]
Speed, Doug [3]
Porcu, Eleonora [4, 5]
Pirastu, Nicola [6]
Pingault, Jean-Baptiste [2, 7]
Kutalik, Zoltan [1, 8, 9]
Affiliations
[1] Univ Lausanne, Dept Computat Biol, Lausanne, Switzerland
[2] UCL, Dept Clin Educ & Hlth Psychol, London, England
[3] Aarhus Univ, Quantitat Genet & Genom, Aarhus, Denmark
[4] Lausanne Univ Hosp, Biomed Data Sci Ctr, Precis Med Unit, Lausanne, Switzerland
[5] Univ Lausanne, Lausanne, Switzerland
[6] Human Technopole, Genom Res Ctr, Milan, Italy
[7] Kings Coll London, Inst Psychiat Psychol & Neurosci, Social Genet & Dev Psychiat Ctr, London, England
[8] Swiss Inst Bioinformat, Lausanne, Switzerland
[9] Univ Ctr Primary Care & Publ Hlth, Lausanne, Switzerland
Funding
Wellcome Trust (UK); European Research Council; Swiss National Science Foundation
Keywords
MORTALITY; HEALTH; REPRESENTATIVENESS; CANCER; SCORE; RISK;
DOI
10.1038/s41562-023-01579-9
Chinese Library Classification number
B84 [Psychology];
Discipline classification code
04 ; 0402 ;
Abstract
While volunteer-based studies such as the UK Biobank have become the cornerstone of genetic epidemiology, the participating individuals are rarely representative of their target population. To evaluate the impact of selective participation, here we derived UK Biobank participation probabilities on the basis of 14 variables harmonized across the UK Biobank and a representative sample. We then conducted weighted genome-wide association analyses on 19 traits. Comparing the output from weighted genome-wide association analyses (n_effective = 94,643 to 102,215) with that from standard genome-wide association analyses (n = 263,464 to 283,749), we found that increasing representativeness led to changes in SNP effect sizes and identified novel SNP associations for 12 traits. While heritability estimates were less impacted by weighting (maximum change in h², 5%), we found substantial discrepancies for genetic correlations (maximum change in r_g, 0.31) and Mendelian randomization estimates (maximum change in β_STD, 0.15) for socio-behavioural traits. We urge the field to increase representativeness in biobank samples, especially when studying genetic correlates of behaviour, lifestyles and social outcomes.

The authors use information on 14 traits and create a representative pseudo-sample of the UK Biobank population, showing that participation bias distorts behavioural genome-wide association study and Mendelian randomization findings.
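
The relationship between participation probabilities, weights and the much smaller effective sample size of a weighted analysis can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: weights are taken as the inverse of the participation probability, the effective sample size follows Kish's formula, and the association is a single-SNP weighted least-squares slope. All function names are hypothetical and do not reproduce the authors' pipeline.

```python
from typing import Sequence


def inverse_probability_weights(probs: Sequence[float]) -> list[float]:
    """Return w_i = 1 / p_i (assumed weighting scheme, not taken from the paper)."""
    for p in probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("participation probabilities must lie in (0, 1]")
    return [1.0 / p for p in probs]


def kish_effective_n(weights: Sequence[float]) -> float:
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def weighted_slope(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Weighted least-squares slope of y on x, with an intercept."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return cov / var


# Toy data: genotype dosages (0/1/2), a trait, and made-up participation probabilities.
dosage = [0, 1, 2, 1, 0, 2]
trait = [1.0, 1.4, 2.1, 1.6, 0.9, 1.8]
probs = [0.8, 0.5, 0.2, 0.5, 0.8, 0.1]

weights = inverse_probability_weights(probs)
n_eff = kish_effective_n(weights)          # never larger than len(weights)
beta_weighted = weighted_slope(dosage, trait, weights)
beta_unweighted = weighted_slope(dosage, trait, [1.0] * len(dosage))
```

Unequal weights always push Kish's effective sample size below the raw sample size, which is in line with the gap the abstract reports between n_effective and n.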
Pages: 1216 / +
Page count: 15