On estimating evolutionary probabilities of population variants

Cited by: 4
Authors
Patel, Ravi [1, 2]
Kumar, Sudhir [1, 2, 3]
Affiliations
[1] Temple Univ, Inst Genom & Evolutionary Med, Philadelphia, PA 19122 USA
[2] Temple Univ, Dept Biol, Philadelphia, PA 19122 USA
[3] King Abdulaziz Univ, Ctr Excellence Genome Med & Res, Jeddah, Saudi Arabia
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Generalized method; Evolutionary probability; Forbidden alleles; Potential adaptation; DIVERGENCE TIMES; CONSERVATION; TIMETREES;
DOI
10.1186/s12862-019-1455-7
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: The evolutionary probability (EP) of an allele in a DNA or protein sequence predicts evolutionarily permissible (ePerm; EP ≥ 0.05) and forbidden (eForb; EP < 0.05) variants. EP of an allele represents an independent evolutionary expectation of observing an allele in a population based solely on the long-term substitution patterns captured in a multiple sequence alignment. In the neutral theory, EP and population frequencies can be compared to identify neutral and non-neutral alleles. This approach has been used to discover candidate adaptive polymorphisms in humans, which are eForbs segregating with high frequencies. The original method to compute EP requires the evolutionary relationships and divergence times of species in the sequence alignment (a timetree), which are not known with certainty for most datasets. This requirement impedes a general use of the original EP formulation. Here, we present an approach in which the phylogeny and times are inferred from the sequence alignment itself prior to the EP calculation. We evaluate if the modified EP approach produces results that are similar to those from the original method.

Results: We compared EP estimates from the original and the modified approaches by using more than 18,000 protein sequence alignments containing orthologous sequences from 46 vertebrate species. For the original EP calculations, we used species relationships from UCSC and divergence times from TimeTree web resource, and the resulting EP estimates were considered to be the ground truth. We found that the modified approaches produced reasonable EP estimates for HGMD disease missense variant and 1000 Genomes Project missense variant datasets. Our results showed that reliable estimates of EP can be obtained without a priori knowledge of the sequence phylogeny and divergence times. We also found that, in order to obtain robust EP estimates, it is important to assemble a dataset with many sequences, sampling from a diversity of species groups.

Conclusion: We conclude that the modified EP approach will be generally applicable for alignments and enable the detection of potentially neutral, deleterious, and adaptive alleles in populations.
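The classification rule stated in the abstract can be illustrated with a minimal Python sketch. It assumes EP values have already been computed by some other means; it does not implement the EP calculation itself. The 0.05 EP threshold comes from the abstract. The `Allele` class, the function names, and the `HIGH_FREQ` cutoff for "high frequency" are hypothetical choices made for illustration only and are not taken from the paper.

```python
from dataclasses import dataclass

# From the abstract: ePerm if EP >= 0.05, eForb if EP < 0.05.
EP_THRESHOLD = 0.05

# ASSUMPTION: the abstract does not define "high frequency";
# this cutoff is a placeholder for illustration only.
HIGH_FREQ = 0.05


@dataclass
class Allele:
    """Hypothetical record pairing a precomputed EP with a population frequency."""
    name: str
    ep: float    # evolutionary probability, assumed precomputed
    freq: float  # observed population allele frequency


def classify(ep: float) -> str:
    """Label an allele as evolutionarily permissible or forbidden."""
    return "ePerm" if ep >= EP_THRESHOLD else "eForb"


def is_candidate_adaptive(allele: Allele, min_freq: float = HIGH_FREQ) -> bool:
    """Flag eForb alleles that segregate at or above the assumed frequency cutoff."""
    return classify(allele.ep) == "eForb" and allele.freq >= min_freq
```

Under these assumptions, an allele with a low EP but a high population frequency is flagged as a candidate adaptive polymorphism, while a low-EP allele at low frequency is not.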
Pages: 14