Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort

Cited by: 158
Authors
Prive, Florian [1]
Aschard, Hugues [2,3]
Carmi, Shai [4]
Folkersen, Lasse [5]
Hoggart, Clive [6]
O'Reilly, Paul F. [6]
Vilhjalmsson, Bjarni J. [1,7]
Affiliations
[1] Aarhus Univ, Natl Ctr Register Based Res, DK-8210 Aarhus, Denmark
[2] Inst Pasteur, Dept Computat Biol, F-75015 Paris, France
[3] Harvard TH Chan Sch Publ Hlth, Program Genet Epidemiol & Stat Genet, Boston, MA 02115 USA
[4] Hebrew Univ Jerusalem, Braun Sch Publ Hlth & Community Med, IL-9112102 Jerusalem, Israel
[5] Danish Natl Genome Ctr, DK-2300 Copenhagen, Denmark
[6] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[7] Aarhus Univ, Bioinformat Res Ctr, DK-8000 Aarhus, Denmark
Funding
National Research Foundation, Singapore;
Keywords
RISK SCORES; POPULATION-STRUCTURE; WIDE ASSOCIATION; PREDICTION; INFERENCE; TRAITS; SNP;
DOI
10.1016/j.ajhg.2021.11.008
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
The low portability of polygenic scores (PGSs) across global populations is a major concern that must be addressed before PGSs can be used for everyone in the clinic. Indeed, prediction accuracy has been shown to decay as a function of the genetic distance between the training and test cohorts. However, such cohorts differ not only in their genetic distance but also in their geographical distance and their data collection and assaying, conflating multiple factors. In this study, we examine the extent to which PGSs are transferable between ancestries by deriving polygenic scores for 245 curated traits from the UK Biobank data and applying them in nine ancestry groups from the same cohort. By restricting both training and testing to the UK Biobank data, we reduce the risk of environmental and genotyping confounding from using different cohorts. We define the nine ancestry groups at a sub-continental level, based on a simple, robust, and effective method that we introduce here. We then apply two different predictive methods to derive polygenic scores for all 245 phenotypes and show a systematic and dramatic reduction in portability of PGSs trained using Northwestern European individuals and applied to nine ancestry groups. These analyses demonstrate that prediction already drops off within European ancestries and reduces globally in proportion to genetic distance. Altogether, our study provides unique and robust insights into the PGS portability problem.
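The quantities named in the abstract can be illustrated with a minimal sketch. It assumes a polygenic score is a weighted sum of allele dosages and measures portability as each group's squared correlation between score and phenotype, divided by that of the training-ancestry group. The function names, the toy data, and the choice of squared Pearson correlation are assumptions made for illustration; the abstract does not specify the accuracy metric or the two predictive methods used in the study.

```python
import numpy as np


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies per variant)."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)


def relative_accuracy(scores, phenotypes, groups, reference):
    """Squared correlation per group, divided by the reference group's value.

    Hypothetical helper: squared Pearson correlation is an assumed metric.
    """
    scores = np.asarray(scores, dtype=float)
    phenotypes = np.asarray(phenotypes, dtype=float)
    groups = np.asarray(groups)
    r2 = {}
    for g in np.unique(groups):
        mask = groups == g
        r = np.corrcoef(scores[mask], phenotypes[mask])[0, 1]
        r2[str(g)] = r ** 2
    return {g: value / r2[reference] for g, value in r2.items()}


# Toy data only: 600 simulated individuals, 50 variants, two ancestry groups.
rng = np.random.default_rng(0)
n_individuals, n_variants = 600, 50
dosages = rng.integers(0, 3, size=(n_individuals, n_variants))
weights = rng.normal(size=n_variants)  # stand-in for estimated effect sizes
groups = np.repeat(["reference", "distant"], n_individuals // 2)

pgs = polygenic_score(dosages, weights)

# The "distant" group gets extra noise purely to mimic lower accuracy;
# this is not a model of why portability drops in real data.
noise_sd = np.where(groups == "reference", 1.0, 5.0)
phenotypes = pgs + rng.normal(size=n_individuals) * noise_sd * pgs.std()

ratios = relative_accuracy(pgs, phenotypes, groups, reference="reference")
for group, ratio in ratios.items():
    print(f"{group}: relative accuracy = {ratio:.2f}")
```

A ratio below 1 for a group means the score predicts less well there than in the reference group, which is the sense in which the abstract describes reduced portability.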
Pages: 12-23
Number of pages: 12
Related papers
59 in total
[1]   FlashPCA2: principal component analysis of Biobank-scale genotype datasets [J].
Abraham, Gad ;
Qiu, Yixuan ;
Inouye, Michael .
BIOINFORMATICS, 2017, 33 (17) :2776-2778
[2]   Accurate and Robust Genomic Prediction of Celiac Disease Using Statistical Learning [J].
Abraham, Gad ;
Tye-Din, Jason A. ;
Bhalala, Oneil G. ;
Kowalczyk, Adam ;
Zobel, Justin ;
Inouye, Michael .
PLOS GENETICS, 2014, 10 (02)
[3]   Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction [J].
Albinana, Clara ;
Grove, Jakob ;
McGrath, John J. ;
Agerbo, Esben ;
Wray, Naomi R. ;
Bulik, Cynthia M. ;
Nordentoft, Merete ;
Hougaard, David M. ;
Werge, Thomas ;
Borglum, Anders D. ;
Mortensen, Preben Bo ;
Prive, Florian ;
Vilhjalmsson, Bjarni J. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2021, 108 (06) :1001-1011
[4]   Fast model-based estimation of ancestry in unrelated individuals [J].
Alexander, David H. ;
Novembre, John ;
Lange, Kenneth .
GENOME RESEARCH, 2009, 19 (09) :1655-1664
[5]   No Evidence from Genome-wide Data of a Khazar Origin for the Ashkenazi Jews [J].
Behar, Doron M. ;
Metspalu, Mait ;
Baran, Yael ;
Kopelman, Naama M. ;
Yunusbayev, Bayazit ;
Gladstein, Ariella ;
Tzur, Shay ;
Sahakyan, Hovhannes ;
Bahmanimehr, Ardeshir ;
Yepiskoposyan, Levon ;
Tambets, Kristiina ;
Khusnutdinova, Elza K. ;
Kushniarevich, Alena ;
Balanovsky, Oleg ;
Balanovsky, Elena ;
Kovacevic, Lejla ;
Marjanovic, Damir ;
Mihailov, Evelin ;
Kouvatsi, Anastasia ;
Triantaphyllidis, Costas ;
King, Roy J. ;
Semino, Ornella ;
Torroni, Antonio ;
Hammer, Michael F. ;
Metspalu, Ene ;
Skorecki, Karl ;
Rosset, Saharon ;
Halperin, Eran ;
Villems, Richard ;
Rosenberg, Noah A. .
HUMAN BIOLOGY, 2013, 85 (06) :859-900
[6]  
Bengtsson, H., 2021, arXiv:2008.00553
[7]   Reduced signal for polygenic adaptation of height in UK Biobank [J].
Berg, Jeremy J. ;
Harpak, Arbel ;
Sinnott-Armstrong, Nasa ;
Joergensen, Anja Moltke ;
Mostafavi, Hakhamanesh ;
Field, Yair ;
Boyle, Evan August ;
Zhang, Xinjun ;
Racimo, Fernando ;
Pritchard, Jonathan K. ;
Coop, Graham .
ELIFE, 2019, 8
[8]   Polygenic Scores for Height in Admixed Populations [J].
Bitarello, Barbara D. ;
Mathieson, Iain .
G3-GENES GENOMES GENETICS, 2020, 10 (11) :4027-4036
[9]  
Bybjerg-Grauholm, J., 2020, IPSYCH2015 CASE COHO, DOI: 10.1101/2020.11.30.20237768
[10]   The UK Biobank resource with deep phenotyping and genomic data [J].
Bycroft, Clare ;
Freeman, Colin ;
Petkova, Desislava ;
Band, Gavin ;
Elliott, Lloyd T. ;
Sharp, Kevin ;
Motyer, Allan ;
Vukcevic, Damjan ;
Delaneau, Olivier ;
O'Connell, Jared ;
Cortes, Adrian ;
Welsh, Samantha ;
Young, Alan ;
Effingham, Mark ;
McVean, Gil ;
Leslie, Stephen ;
Allen, Naomi ;
Donnelly, Peter ;
Marchini, Jonathan .
NATURE, 2018, 562 (7726) :203-+