Recognition of an organism from fragments of its complete genome

Cited by: 35
Authors
Anh, V.V. [1]
Lau, K.S. [2]
Yu, Z.G. [1,3]
Affiliations
[1] Ctr. in Stat. Sci. and Indust. Math., Queensland University of Technology, P. O. Box 2434, Brisbane Q4001, Australia
[2] Department of Mathematics, Chinese University of Hong Kong, Shatin, Hong Kong
[3] Department of Mathematics, Xiangtan University, Hunan 411105, China
Source
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics | 2002 / Vol. 66 / No. 3
Keywords
Bacteria - Correlation methods - Database systems - DNA - Functions - Graphic methods - Iterative methods - Probability density function - Random processes
DOI
10.1103/PhysRevE.66.031910
Abstract
This paper considers the problem of matching a fragment to an organism using its complete genome. Our method is based on the probability measure representation of a genome. We first demonstrate that these probability measures can be modeled as recurrent iterated function systems (RIFS) consisting of four contractive similarities. Our hypothesis is that the multifractal characteristics of the probability measure of a complete genome, as captured by the RIFS, are preserved in its reasonably long fragments. We compute the RIFS of fragments of various lengths and random starting points, and compare them with that of the complete genome for recognition using the Euclidean distance. A demonstration on five randomly selected organisms supports the above hypothesis. © 2002 The American Physical Society.
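The matching step described in the abstract can be illustrated with a simplified sketch. It is not the authors' procedure: the paper compares fitted RIFS parameters, whereas this sketch swaps in a direct comparison of K-string frequency vectors (one common way to build a probability measure from a DNA sequence) and skips the RIFS fit entirely. The function names `kmer_measure`, `euclidean`, and `recognize`, the default `k=3`, and the alphabet ordering a, c, g, t are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_measure(seq, k=3):
    """Frequency of each of the 4**k K-strings, in lexicographic order (a < c < g < t).

    Assumption: this stands in for the paper's probability measure; the
    RIFS estimation step is NOT implemented here.
    """
    seq = seq.lower()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    keys = ["".join(p) for p in product("acgt", repeat=k)]
    # Windows containing other symbols (e.g. 'n') are ignored.
    total = sum(counts[key] for key in keys)
    if total == 0:
        raise ValueError("sequence contains no valid K-strings")
    return [counts[key] / total for key in keys]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recognize(fragment, genomes, k=3):
    """Return the name of the genome whose measure is closest to the fragment's.

    `genomes` is a hypothetical dict mapping organism name -> sequence string.
    """
    frag = kmer_measure(fragment, k)
    return min(genomes, key=lambda name: euclidean(frag, kmer_measure(genomes[name], k)))
```

With this sketch, `recognize(fragment, {"organism_a": seq_a, "organism_b": seq_b})` returns the key of the nearest genome. To follow the paper more closely, the frequency vectors would be replaced by the estimated RIFS parameters before taking the distance.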
Pages: 1 / 031910