Applications of recursive segmentation to the analysis of DNA sequences

Cited by: 69
Authors
Li, WT
Bernaola-Galván, P
Haghighi, F
Grosse, I
Affiliations
[1] N Shore LIJ Res Inst, Ctr Genom & Human Genet, Manhasset, NY 11030 USA
[2] Rockefeller Univ, Lab Stat Genet, New York, NY 10021 USA
[3] Univ Malaga, Dept Fis Aplicada 2, E-29071 Malaga, Spain
[4] Columbia Univ, Columbia Genome Ctr, New York, NY 10032 USA
[5] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
Source
COMPUTERS & CHEMISTRY | 2002 / Vol. 26 / Issue 05
Funding
US National Science Foundation; US National Institutes of Health;
Keywords
recursive segmentation; DNA sequence; dinucleotide;
DOI
10.1016/S0097-8485(02)00010-4
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as a binary strong (G + C) / weak (A + T) sequence, a binary sequence indicating the presence or absence of the dinucleotide CpG, or a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions. © 2002 Elsevier Science Ltd. All rights reserved.
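
The procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the splitting or stopping criterion, so the sketch assumes one common choice: cut where the Jensen-Shannon divergence between the two sides is largest, and stop when that divergence falls below a fixed threshold. The function names (`best_split`, `segment`, `to_strong_weak`, `to_cpg`) and the parameters `threshold` and `min_len` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy in bits of a symbol-count table covering n symbols."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (cut, jsd) for the cut that maximizes the Jensen-Shannon
    divergence between seq[:cut] and seq[cut:]; cut is 0 if none is positive."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_cut, best_jsd = 0, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts only
        jsd = (h_total
               - (i / n) * entropy(left, i)
               - ((n - i) / n) * entropy(right, n - i))
        if jsd > best_jsd:
            best_cut, best_jsd = i, jsd
    return best_cut, best_jsd


def segment(seq, threshold=0.1, min_len=4, offset=0):
    """Recursively split seq and return the sorted boundary positions.

    Assumption: recursion stops when the best divergence is below
    `threshold` (bits) or the piece is shorter than 2 * min_len.
    """
    if len(seq) < 2 * min_len:
        return []
    cut, jsd = best_split(seq)
    if cut == 0 or jsd < threshold:
        return []
    return (segment(seq[:cut], threshold, min_len, offset)
            + [offset + cut]
            + segment(seq[cut:], threshold, min_len, offset + cut))


def to_strong_weak(dna):
    """Convert DNA to a binary strong (G, C) / weak (A, T) sequence."""
    return "".join("S" if base in "GC" else "W" for base in dna.upper())


def to_cpg(dna):
    """Mark each position with 1 if a CpG dinucleotide starts there, else 0."""
    dna = dna.upper()
    return "".join("1" if dna[i:i + 2] == "CG" else "0" for i in range(len(dna)))


if __name__ == "__main__":
    toy = "AT" * 20 + "GC" * 20  # AT-rich block followed by a GC-rich block
    print(segment(to_strong_weak(toy)))
```

Any of the conversion schemes can be passed to `segment`, since it only compares symbol frequencies; a fixed threshold is the simplest stopping rule and would likely need to be replaced by a significance test or model-selection criterion on real data.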
Pages: 491-510
Page count: 20