Gaussian process test for high-throughput sequencing time series: application to experimental evolution

Cited by: 27
Authors
Topa, Hande [1]
Jonas, Agnes [2, 3]
Kofler, Robert [2]
Kosiol, Carolin [2]
Honkela, Antti [4]
Affiliations
[1] Aalto Univ, HIIT, Dept Informat & Comp Sci, Espoo, Finland
[2] Vetmeduni Vienna, Inst Populat Genet, A-1210 Vienna, Austria
[3] Vienna Grad Sch Populat Genet, Vienna, Austria
[4] Univ Helsinki, Dept Comp Sci, HIIT, SF-00510 Helsinki, Finland
Funding
Academy of Finland; Austrian Science Fund
Keywords
GENE-EXPRESSION; POPULATIONS; TRAJECTORIES; ADAPTATION; DROSOPHILA; SELECTION;
DOI
10.1093/bioinformatics/btv014
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but also monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analyzing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth.
Results: We present the beta-binomial Gaussian process model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as the proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present simulations exploring different experimental design choices and results on real data from a Drosophila experimental evolution experiment on temperature adaptation.
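As a rough illustration of how a beta-binomial model can express uncertainty from finite sequencing depth, the sketch below computes the posterior mean and variance of an allele frequency from read counts. It is not the authors' implementation: the function name `beta_binomial_posterior`, the uniform Beta(1, 1) prior, and the example counts are all assumptions made for illustration. Such per-time-point variances could plausibly be supplied to a time-series model as observation noise, but how the paper couples the two parts is not specified in this abstract.

```python
def beta_binomial_posterior(alt_reads, depth, alpha=1.0, beta=1.0):
    """Posterior mean and variance of an allele frequency p.

    Assumes alt_reads ~ Binomial(depth, p) with a Beta(alpha, beta) prior,
    so the posterior is Beta(alpha + alt_reads, beta + depth - alt_reads).
    The default Beta(1, 1) prior is an illustrative assumption.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    a = alpha + alt_reads
    b = beta + depth - alt_reads
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


# Hypothetical counts: the same observed frequency at two sequencing depths.
shallow = beta_binomial_posterior(alt_reads=5, depth=10)
deep = beta_binomial_posterior(alt_reads=500, depth=1000)

print("depth 10:   mean=%.3f var=%.6f" % shallow)
print("depth 1000: mean=%.3f var=%.6f" % deep)
```

Both calls give the same posterior mean, while the variance shrinks as depth grows, which is the sense in which shallow coverage yields a less certain frequency estimate.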
Pages: 1762-1770
Page count: 9