A reference haplotype panel for genome-wide imputation of short tandem repeats

Cited by: 46
Authors
Saini, Shubham [1]
Mitra, Ileena [2]
Mousavi, Nima [3]
Fotsing, Stephanie Feupe [2, 4]
Gymrek, Melissa [1, 5]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Bioinformat & Syst Biol Program, 9500 Gilman Dr, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Elect & Comp Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[4] Univ Calif San Diego, Dept Biomed Informat, 9500 Gilman Dr, La Jolla, CA 92093 USA
[5] Univ Calif San Diego, Dept Med, 9500 Gilman Dr, La Jolla, CA 92093 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
GENE-EXPRESSION VARIATION; LINKAGE DISEQUILIBRIUM; DNA METHYLATION; CAG REPEAT; EXPANSION; MICROSATELLITE; VARIANTS; MUTATION; DISEASE; ASSOCIATION;
DOI
10.1038/s41467-018-06694-0
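Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract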
Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in complex traits. However, genotyping arrays used in genome-wide association studies focus on single nucleotide polymorphisms (SNPs) and do not readily allow identification of STR associations. We leverage next-generation sequencing (NGS) from 479 families to create a SNP + STR reference haplotype panel. Our panel enables imputing STR genotypes into SNP array data when NGS is not available for directly genotyping STRs. Imputed genotypes achieve mean concordance of 97% with observed genotypes in an external dataset compared to 71% expected under a naive model. Performance varies widely across STRs, with near perfect concordance at bi-allelic STRs vs. 70% at highly polymorphic repeats. Imputation increases power over individual SNPs to detect STR associations with gene expression. Imputing STRs into existing SNP datasets will enable the first large-scale STR association studies across a range of complex traits.
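The concordance figures in the abstract compare imputed STR genotypes with directly observed ones. As a rough illustration only, the sketch below scores each diploid call by the fraction of its two alleles that match, ignoring allele order. This scoring rule, the function names `call_concordance` and `mean_concordance`, and the toy repeat counts are all assumptions made for illustration; the record does not state how the authors define concordance.

```python
from collections import Counter

def call_concordance(observed, imputed):
    """Score one diploid call as the fraction of alleles that match: 0, 0.5, or 1.

    Assumed scoring rule, not taken from the paper. Allele order is ignored.
    """
    remaining = Counter(observed)  # observed alleles not yet matched
    matched = 0
    for allele in imputed:
        if remaining[allele] > 0:
            remaining[allele] -= 1
            matched += 1
    return matched / 2

def mean_concordance(observed_calls, imputed_calls):
    """Average the per-call scores over paired observed and imputed calls."""
    scores = [call_concordance(o, i) for o, i in zip(observed_calls, imputed_calls)]
    return sum(scores) / len(scores)

# Toy data: each tuple holds the repeat counts of the two alleles at one STR.
observed = [(10, 12), (11, 11), (9, 13)]
imputed = [(12, 10), (11, 12), (10, 10)]

print(mean_concordance(observed, imputed))
```

Under this rule the three toy calls score 1, 0.5, and 0: the first matches after reordering, the second shares one allele, and the third shares none.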
Pages: 11