genome evolution;
paralogous genes;
gene family size distribution;
Markov chain;
DOI:
10.1142/S0218202507002169
Chinese Library Classification (CLC) number:
O29 [Applied Mathematics];
Discipline classification code:
070104;
Abstract:
We introduce and analyze a simple probabilistic model of genome evolution based on three fundamental evolutionary events: gene loss, gene duplication, and accumulated change. We are mainly interested in the asymptotic size distribution of small paralogous gene families in a genome. The work is motivated by earlier studies that fitted the available genomic data to so-called paralog distributions. The model is formalized as a discrete-time Markov chain, and formulas for the equilibrium paralog family sizes are derived. Moreover, we show that when the probabilities of gene removal and duplication are small and close to each other, the resulting distribution is close to a logarithmic distribution. Some empirical results for microbial genomes are presented.