Transfer Learning in Genome-Wide Association Studies with Knockoffs

Cited by: 0
Authors
Li, Shuangning [1]
Ren, Zhimei [2]
Sabatti, Chiara [3, 4]
Sesia, Matteo [5]
Affiliations
[1] Harvard Univ, Dept Stat, Stanford, CA 94305 USA
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[3] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[5] Univ Southern Calif, Dept Data Sci & Operat, Los Angeles, CA 90089 USA
Source
SANKHYA-SERIES B-APPLIED AND INTERDISCIPLINARY STATISTICS | 2022
Keywords
False discovery rate; model selection; population structure; Primary 62; demographic history; traits; power
DOI
10.1007/s13571-022-00297-y
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
This paper presents and compares alternative transfer learning methods that can increase the power of conditional testing via knockoffs by leveraging prior information in external data sets collected from different populations or measuring related outcomes. The relevance of this methodology is explored in particular within the context of genome-wide association studies, where it can help address the pressing need for principled ways to account for, and efficiently learn from, the genetic variation associated with diverse ancestries. Finally, we apply these methods to analyze several phenotypes in the UK Biobank data set, demonstrating that transfer learning helps knockoffs discover more associations in the data collected from minority populations, potentially opening the way to the development of more accurate polygenic risk scores.
Pages: 39