Applications of recursive segmentation to the analysis of DNA sequences

Cited by: 69
Authors
Li, WT
Bernaola-Galván, P
Haghighi, F
Grosse, I
Affiliations
[1] N Shore LIJ Res Inst, Ctr Genom & Human Genet, Manhasset, NY 11030 USA
[2] Rockefeller Univ, Lab Stat Genet, New York, NY 10021 USA
[3] Univ Malaga, Dept Fis Aplicada 2, E-29071 Malaga, Spain
[4] Columbia Univ, Columbia Genome Ctr, New York, NY 10032 USA
[5] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
Source
COMPUTERS & CHEMISTRY | 2002 / Vol. 26 / Issue 05
Funding
US National Science Foundation; US National Institutes of Health;
Keywords
recursive segmentation; DNA sequence; dinucleotide;
DOI
10.1016/S0097-8485(02)00010-4
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions. (C) 2002 Elsevier Science Ltd. All rights reserved.
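As a rough illustration of the procedure the abstract describes, the sketch below converts a DNA string to a binary strong/weak sequence and splits it recursively at the point that maximizes the Jensen-Shannon divergence between the left and right parts. This is a minimal sketch under assumptions, not the authors' implementation: the function names (`to_strong_weak`, `best_split`, `segment`), the stopping rule (a fixed cutoff on 2·N·D, with D in nats) and the default `threshold` and `min_len` values are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def to_strong_weak(dna):
    """Map G/C to 'S' (strong) and A/T to 'W' (weak); one possible conversion scheme."""
    return "".join("S" if base in "GC" else "W" for base in dna.upper())


def entropy(counts, n):
    """Shannon entropy (in nats) of a symbol count table covering n symbols."""
    return -sum(c / n * math.log(c / n) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (cut position, Jensen-Shannon divergence) for the best single cut.

    The position is None when no cut yields a positive divergence.
    """
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # counts of the symbols to the right of the cut
        d = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, start=0, threshold=10.0, min_len=2):
    """Recursively cut seq and return the absolute boundary positions in order.

    Assumed stopping rule: stop when 2 * N * D falls below `threshold`.
    """
    n = len(seq)
    if n < 2 * min_len:
        return []
    pos, d = best_split(seq)
    if pos is None or 2 * n * d < threshold:
        return []
    return (segment(seq[:pos], start, threshold, min_len)
            + [start + pos]
            + segment(seq[pos:], start + pos, threshold, min_len))


# Toy input: an AT-rich half followed by a GC-rich half.
dna = "AT" * 50 + "GC" * 50
print(segment(to_strong_weak(dna)))  # prints [100]
```

The same `segment` function accepts any symbol sequence, so other conversions mentioned in the abstract (for example a binary CpG presence/absence sequence) could be passed in without changing the splitting code. A fixed cutoff is the simplest stopping rule to show; a significance-based criterion would be the natural replacement in real use.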
Pages: 491-510
Number of pages: 20