Boosting forward-time population genetic simulators through genotype compression

Cited by: 5
Authors
Ruths, Troy [1 ]
Nakhleh, Luay [1 ]
Affiliations
[1] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
Funding
US National Science Foundation;
Keywords
NETWORKS; BIOLOGY; MODELS;
DOI
10.1186/1471-2105-14-192
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Forward-time population genetic simulations play a central role in deriving and testing evolutionary hypotheses. Such simulations may be data-intensive, depending on the settings of the various parameters that control them. In particular, for certain settings, the data footprint may quickly exceed the memory of a single compute node.
Results: We develop a novel and general method for addressing the memory issue inherent in forward-time simulations by compressing and decompressing, in real time, active and ancestral genotypes, while carefully accounting for the time overhead. We propose a general graph data structure for compressing the genotype space explored during a simulation run, along with efficient algorithms for constructing and updating compressed genotypes that support both mutation and recombination. We tested the performance of our method in very large-scale simulations. The results show that our method not only scales well but also overcomes memory issues that would cripple existing tools.
Conclusions: As evolutionary analyses are increasingly performed on genomes, pathways, and networks, particularly in the era of systems biology, scaling population genetic simulators to handle large-scale simulations is crucial. We believe our method offers a significant step in that direction. Further, the techniques we provide are generic and can be integrated with existing population genetic simulators to reduce their memory usage.
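The abstract does not specify the graph data structure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each genotype is stored as a node holding references to its parent genotypes plus the difference from them (a set of point mutations or a single crossover position), and that only founder genotypes keep a full sequence. The names `Node`, `GenotypeGraph`, `mutate`, `recombine`, and `decompress` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One compressed genotype (hypothetical layout, not from the paper)."""
    parents: Tuple[int, ...] = ()                             # ids of parent nodes
    mutations: Dict[int, str] = field(default_factory=dict)   # position -> allele
    crossover: Optional[int] = None                           # set for recombinants
    sequence: Optional[str] = None                            # set only for founders


class GenotypeGraph:
    """Stores genotypes as differences from their ancestors."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []

    def _add(self, node: Node) -> int:
        self.nodes.append(node)
        return len(self.nodes) - 1

    def add_root(self, sequence: str) -> int:
        # Founders are the only nodes that hold an uncompressed sequence.
        return self._add(Node(sequence=sequence))

    def mutate(self, parent: int, mutations: Dict[int, str]) -> int:
        # A mutant stores only the changed positions relative to its parent.
        return self._add(Node(parents=(parent,), mutations=dict(mutations)))

    def recombine(self, left: int, right: int, crossover: int) -> int:
        # A recombinant stores two parent ids and one crossover position.
        return self._add(Node(parents=(left, right), crossover=crossover))

    def decompress(self, gid: int) -> str:
        # Rebuild the full sequence by walking back towards the founders.
        node = self.nodes[gid]
        if node.sequence is not None:
            return node.sequence
        if node.crossover is not None:
            left = self.decompress(node.parents[0])
            right = self.decompress(node.parents[1])
            return left[:node.crossover] + right[node.crossover:]
        seq = list(self.decompress(node.parents[0]))
        for pos, allele in node.mutations.items():
            seq[pos] = allele
        return "".join(seq)


graph = GenotypeGraph()
founder = graph.add_root("AAAAAAAA")
mutant_a = graph.mutate(founder, {2: "C"})
mutant_b = graph.mutate(founder, {6: "G"})
recombinant = graph.recombine(mutant_a, mutant_b, crossover=4)
print(graph.decompress(recombinant))
```

In this sketch, memory per genotype is proportional to the number of differences rather than to the sequence length, at the cost of extra time to decompress. The recursive `decompress` is kept for brevity; a long lineage would need an iterative walk or cached intermediate sequences, which is the kind of time overhead the abstract says must be accounted for.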
Pages: 12