Ranking insertion, deletion and nonsense mutations based on their effect on genetic information

Cited by: 20
Authors
Zia, Amin [1 ]
Moses, Alan M. [1 ]
Affiliations
[1] Univ Toronto, Dept Cell & Syst Biol, Toronto, ON M5S 3B2, Canada
Source
BMC BIOINFORMATICS | 2011 / Vol. 12
Keywords
HUMAN CANCER; FREQUENCY; GENOME; DISCOVERY; SELECTION; POLYMORPHISM; CARCINOMAS; SEQUENCES; DATABASE; MOTIFS;
DOI
10.1186/1471-2105-12-299
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Genetic variations contribute to normal phenotypic differences as well as diseases, and new sequencing technologies are greatly increasing the capacity to identify these variations. Given the large number of variations now being discovered, computational methods to prioritize the functional importance of genetic variations are of growing interest. Thus far, the focus of computational tools has been mainly on the prediction of the effects of amino acid-changing single nucleotide polymorphisms (SNPs), and little attention has been paid to indels or nonsense SNPs that result in premature stop codons. Results: We propose computational methods to rank insertion-deletion mutations in the coding as well as non-coding regions and nonsense mutations. We rank these variations by measuring the extent of their effect on biological function, based on the assumption that evolutionary conservation reflects function. Using sequence data from budding yeast and human, we show that variations that we predict to have larger effects segregate at significantly lower allele frequencies, and occur less frequently than expected by chance, indicating stronger purifying selection. Furthermore, we find that insertions, deletions and premature stop codons associated with disease in humans have significantly larger predicted effects than those not associated with disease. Interestingly, the large-effect mutations associated with disease show a similar distribution of predicted effects to that expected for completely random mutations. Conclusions: This demonstrates that the evolutionary conservation context of the sequences that harbour insertions, deletions and nonsense mutations can be used to predict and rank the effects of the mutations.
Pages: 14