Using variant databases for variant prioritization and to detect erroneous genotype-phenotype associations

Cited by: 9
Authors
Broeckx, Bart J. G. [1]
Peelman, Luc [1]
Saunders, Jimmy H. [2]
Deforce, Dieter [3]
Clement, Lieven [4]
Affiliations
[1] Univ Ghent, Lab Anim Genet, Fac Vet Med, Heidestr 19, B-9820 Merelbeke, Belgium
[2] Univ Ghent, Dept Med Imaging & Orthoped, Fac Vet Med, Merelbeke, Belgium
[3] Univ Ghent, Fac Pharmaceut Sci, Lab Pharmaceut Biotechnol, Ghent, Belgium
[4] Univ Ghent, Dept Appl Math Comp Sci & Stat, Fac Sci, Ghent, Belgium
Source
BMC BIOINFORMATICS | 2017 / Volume 18
Keywords
1000 Genomes project variant database; Allele frequency; dbSNP; HapMap; Variant filtering; Variant database; BIOTINIDASE DEFICIENCY; GENETIC-VARIATION; CEREBROTENDINOUS XANTHOMATOSIS; ACROFACIAL DYSOSTOSIS; MUTATION DATABASES; BARTTERS-SYNDROME; MILLER SYNDROME; DISEASE; COMMON; TOOL;
DOI
10.1186/s12859-017-1951-y
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: In the search for novel causal mutations, public and/or private variant databases are nearly always used to facilitate the search, as they massively reduce the number of putative variants in one step. In practice, variant filtering is often done either by using all variants from the variant database (the absence-approach, which assumes that disease-causing variants do not reside in variant databases) or by using the subset of variants with an allelic frequency > 1% (the 1%-approach). We investigate the validity of these two approaches in terms of false negatives (the true disease-causing variant does not pass all filters) and false positives (a harmless mutation passes all filters and is erroneously retained in the list of putative disease-causing variants) and compare them with a novel approach, which we named the quantile-based approach. This approach applies variable instead of static frequency thresholds, and the calculation of these thresholds is based on prior knowledge of disease prevalence, inheritance models, database size and database characteristics.
Results: Based on real-life data, we demonstrate that the quantile-based approach outperforms the absence-approach in terms of false negatives. At the same time, the quantile-based approach deals more appropriately than the 1%-approach with the variable allele frequencies of disease-causing alleles in variant databases, and as such allows better control of the number of false positives. We also introduce an alternative application for variant database usage and the quantile-based approach. When disease-causing variants in variant databases deviated substantially from the theoretical expectations calculated with the quantile-based approach, the association between genotype and phenotype had to be reconsidered in 12 out of 13 cases.
Conclusions: We developed a novel method and demonstrated that this so-called quantile-based approach is a highly suitable method for variant filtering. In addition, the quantile-based approach can also be used for variant flagging. For user friendliness, lookup tables and easy-to-use R calculators are provided.
Pages: 10