Using Huffman Coding Method to Visualize and Analyze DNA Sequences

Cited by: 18
Authors
Qi, Zhao-Hui [1]
Li, Ling [2]
Qi, Xiao-Qin [1]
Affiliations
[1] Shijiazhuang Tiedao Univ, Coll Informat Sci & Technol, Shijiazhuang 050043, Hebei, Peoples R China
[2] Zhejiang Shuren Univ, Basic Courses Dept, Hangzhou 310015, Zhejiang, Peoples R China
Keywords
Huffman coding method; graphical representation; DNA sequence; sequence analysis; 2D GRAPHICAL REPRESENTATION; CHAOS-GAME REPRESENTATION; NUMERICAL CHARACTERIZATION; DUAL NUCLEOTIDES; H-CURVES; SIMILARITY/DISSIMILARITY; CLASSIFICATION; DESCRIPTORS; INVARIANTS; MATRIX;
DOI
10.1002/jcc.21906
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequences. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. A multicomponent vector derived from the representation is then introduced to characterize DNA sequences quantitatively. The components of the vector are derived from the graphical representation of the DNA primary sequence. An examination of similarities and dissimilarities among the complete coding sequences of the beta-globin gene of 11 species and among six ND6 proteins shows the utility of the scheme. (C) 2011 Wiley Periodicals, Inc. J Comput Chem 32: 3233-3240, 2011
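The abstract does not spell out how the Huffman codes are built or how they are mapped to a curve, so the following is only a minimal sketch of the general Huffman step, not the authors' construction. It assumes the code is derived from nucleotide frequencies in the sequence itself; the example sequence and the function names `huffman_codes`, `encode`, and `decode` are hypothetical. It shows why a prefix-free code lets the original sequence be recovered exactly, which is the kind of lossless property the abstract refers to.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(sequence):
    """Return a prefix-free binary code for each symbol in `sequence`."""
    freqs = Counter(sequence)
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(weight, next(tiebreak), {symbol: ""})
            for symbol, weight in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prepend one bit to each side.
        w0, _, low = heapq.heappop(heap)
        w1, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the code words of every base in the sequence."""
    return "".join(codes[base] for base in sequence)


def decode(bits, codes):
    """Invert `encode`; unambiguous because the code is prefix-free."""
    inverse = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


dna = "ATGGCATTAGCGGATTACAGG"  # arbitrary example string (assumption)
codes = huffman_codes(dna)
bits = encode(dna, codes)
recovered = decode(bits, codes)
```

Because no code word is a prefix of another, `recovered` equals `dna`, so the bit string carries the full sequence information. How the paper turns such codes into a graphical representation is not stated in this record.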
Pages: 3233-3240
Number of pages: 8