Similarity analysis of DNA sequences based on the relative entropy

Cited by: 0
Authors
Yang, WL
Pi, XJ
Zhang, LQ
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200030, Peoples R China
[2] Shanghai Inst Biol Sci, Shanghai Inst Syst Biol, Shanghai, Peoples R China
[3] Shanghai Maritime Univ, Dept Elect Engn, Shanghai 200135, Peoples R China
Source
ADVANCES IN NATURAL COMPUTATION, PT 1, PROCEEDINGS | 2005, Vol. 3610
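Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract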
This paper investigates the similarity of two sequences, one of the main issues in fragment clustering and classification when sequencing the genomes of microbial communities sampled directly from the natural environment. We use the relative entropy as a criterion for the similarity of two sequences and discuss its characteristics in DNA sequences. A method for evaluating the relative entropy is presented and applied to the comparison of two sequences. By combining the relative entropy with the length of the variables defined in this paper, the similarity of sequences is easily obtained. SOM and PCA are applied to cluster subsequences from different genomes. Computer simulations verify that the method works well.
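
The abstract does not give the estimator itself, so the following is only a minimal sketch of how a relative entropy between two DNA sequences might be computed. It assumes that each sequence is summarized by its k-mer (word) frequencies over the alphabet ACGT, that a pseudocount avoids zero probabilities, and that the two directed divergences are summed to obtain a symmetric score. The function names, the choice k=2, and the pseudocount are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter
from itertools import product


def kmer_distribution(seq, k=2, pseudocount=1.0):
    """Return smoothed k-mer frequencies over the alphabet ACGT (assumed model)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Words containing other symbols (e.g. N) are ignored; the pseudocount
    # keeps every probability strictly positive.
    total = sum(counts[w] for w in kmers) + pseudocount * len(kmers)
    return {w: (counts[w] + pseudocount) / total for w in kmers}


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) D(p || q) in bits."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def symmetric_divergence(seq_a, seq_b, k=2):
    """Sum of both directed divergences; smaller values mean more similar."""
    p = kmer_distribution(seq_a, k)
    q = kmer_distribution(seq_b, k)
    return relative_entropy(p, q) + relative_entropy(q, p)
```

Under these assumptions, a sequence compared with itself scores zero, and the score grows as the word usage of the two sequences diverges. Vectors of such k-mer frequencies could also serve as input to a clustering step such as the SOM or PCA mentioned above, although the abstract does not state which features the authors used.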
Pages: 1035 - 1038
Page count: 4