On estimating evolutionary probabilities of population variants

Cited by: 4
Authors
Patel, Ravi [1, 2]
Kumar, Sudhir [1, 2, 3]
Affiliations
[1] Temple Univ, Inst Genom & Evolutionary Med, Philadelphia, PA 19122 USA
[2] Temple Univ, Dept Biol, Philadelphia, PA 19122 USA
[3] King Abdulaziz Univ, Ctr Excellence Genome Med & Res, Jeddah, Saudi Arabia
Funding
US National Science Foundation; US National Institutes of Health
Keywords
Generalized method; Evolutionary probability; Forbidden alleles; Potential adaptation; Divergence times; Conservation; Timetrees
DOI
10.1186/s12862-019-1455-7
Chinese Library Classification number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: The evolutionary probability (EP) of an allele in a DNA or protein sequence predicts evolutionarily permissible (ePerm; EP ≥ 0.05) and forbidden (eForb; EP < 0.05) variants. The EP of an allele represents an independent evolutionary expectation of observing that allele in a population, based solely on the long-term substitution patterns captured in a multiple sequence alignment. In the neutral theory, EP and population frequencies can be compared to identify neutral and non-neutral alleles. This approach has been used to discover candidate adaptive polymorphisms in humans, which are eForbs segregating at high frequencies. The original method to compute EP requires the evolutionary relationships and divergence times of the species in the sequence alignment (a timetree), which are not known with certainty for most datasets. This requirement impedes general use of the original EP formulation. Here, we present an approach in which the phylogeny and times are inferred from the sequence alignment itself prior to the EP calculation. We evaluate whether the modified EP approach produces results similar to those from the original method.

Results: We compared EP estimates from the original and the modified approaches by using more than 18,000 protein sequence alignments containing orthologous sequences from 46 vertebrate species. For the original EP calculations, we used species relationships from UCSC and divergence times from the TimeTree web resource, and the resulting EP estimates were considered to be the ground truth. We found that the modified approaches produced reasonable EP estimates for the HGMD disease missense variant and 1000 Genomes Project missense variant datasets. Our results showed that reliable estimates of EP can be obtained without a priori knowledge of the sequence phylogeny and divergence times. We also found that, to obtain robust EP estimates, it is important to assemble a dataset with many sequences, sampling from a diversity of species groups.

Conclusion: We conclude that the modified EP approach will be generally applicable for alignments and will enable the detection of potentially neutral, deleterious, and adaptive alleles in populations.
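The classification rule stated in the abstract can be illustrated with a minimal sketch. It assumes EP values have already been computed by some external tool; it does not reproduce the authors' EP calculation. The `Allele` class, the `classify` function, and the `HIGH_FREQ` cutoff are hypothetical names and values chosen for illustration only. The abstract gives the 0.05 EP threshold but does not define what counts as a "high" population frequency.

```python
from dataclasses import dataclass

# From the abstract: ePerm if EP >= 0.05, eForb if EP < 0.05.
EP_THRESHOLD = 0.05

# Assumption: an illustrative cutoff for "high" population frequency.
# The abstract does not give a numeric value.
HIGH_FREQ = 0.05


@dataclass
class Allele:
    name: str    # hypothetical label for the variant
    ep: float    # evolutionary probability, computed elsewhere
    freq: float  # observed population allele frequency


def classify(allele: Allele, high_freq: float = HIGH_FREQ) -> str:
    """Label an allele as ePerm, eForb, or a candidate adaptive eForb."""
    if allele.ep >= EP_THRESHOLD:
        return "ePerm"
    if allele.freq >= high_freq:
        # An eForb segregating at high frequency is unexpected under
        # neutrality, so it is flagged as a candidate adaptive allele.
        return "eForb (candidate adaptive)"
    return "eForb"


if __name__ == "__main__":
    examples = [
        Allele("variant_A", ep=0.40, freq=0.30),
        Allele("variant_B", ep=0.01, freq=0.001),
        Allele("variant_C", ep=0.01, freq=0.60),
    ]
    for a in examples:
        print(a.name, classify(a))
```

Passing a different `high_freq` value changes only which eForbs are flagged as candidates; the ePerm/eForb split depends on the EP threshold alone.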
Pages: 14