New method for comparing DNA primary sequences based on a discrimination measure

Cited by: 8
Authors
Feng, Jie [1, 2]
Hu, Yong [3]
Wan, Ping [3]
Zhang, Aibing [3]
Zhao, Weizhong [1, 2]
Affiliations
[1] Capital Normal Univ, Sch Math Sci, Beijing 100048, Peoples R China
[2] Capital Normal Univ, Inst Math & Interdisciplinary Sci, Beijing 100048, Peoples R China
[3] Capital Normal Univ, Coll Life Sci, Beijing 100048, Peoples R China
Keywords
Pairwise distance; Similarity analysis; Phylogenetic tree; CHAOS GAME REPRESENTATION; 2-D GRAPHICAL REPRESENTATION; BURROWS-WHEELER TRANSFORM; BIOLOGICAL SEQUENCES; GENOMIC SIGNATURE; PROTEIN SEQUENCES; NUMERICAL CHARACTERIZATION; CORONAVIRUS PHYLOGENY; STATISTICAL MEASURES; ALIGNMENT
DOI
10.1016/j.jtbi.2010.07.040
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09
Abstract
We introduce a new approach for comparing DNA primary sequences. The core of the method is a new measure of pairwise distance between sequences. Using the primitive discrimination substrings of two sequences S and Q, we define a discrimination measure DM(S, Q) for analyzing their similarity. The proposed method does not require multiple alignments and is fully automatic. To illustrate its utility, we construct phylogenetic trees on two independent data sets. The results indicate that the method is efficient and powerful. © 2010 Elsevier Ltd. All rights reserved.
Pages: 703-707
Page count: 5