Structural assignments to the Mycoplasma genitalium proteins show extensive gene duplications and domain rearrangements

Cited by: 132
Authors
Teichmann, SA [1]
Park, J [1]
Chothia, C [1]
Affiliation
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
DOI
10.1073/pnas.95.25.14658
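Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Subject classification code
07; 0710; 09
Abstract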
The parasitic bacterium Mycoplasma genitalium has a small, reduced genome with close to a basic set of genes. As a first step toward determining the families of protein domains that form the products of these genes, we have used the multiple sequence programs PSI-BLAST and GEANFAMMER to match the sequences of the 467 gene products of M. genitalium to the sequences of the domains that form proteins of known structure [Protein Data Bank (PDB) sequences]. PDB sequences (274) match all of 106 M. genitalium sequences and some parts of another 85; thus, 41% of its total sequences are matched in all or part. The evolutionary relationships of the PDB domains that match M. genitalium are described in the structural classification of proteins (SCOP) database. Using this information, we show that the domains in the matched M. genitalium sequences come from 114 superfamilies and that 58% of them have arisen by gene duplication. This level of duplication is more than twice that found by using pairwise sequence comparisons. The PDB domain matches also describe the domain structure of the matched sequences: just over a quarter contain one domain and the rest have combinations of two or more domains.
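The coverage figure in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above (467 gene products, 106 fully matched, 85 partly matched); the constant and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract (the names below are illustrative).
TOTAL_SEQUENCES = 467   # gene products of M. genitalium
FULLY_MATCHED = 106     # sequences matched over their whole length
PARTLY_MATCHED = 85     # sequences matched over some parts only


def matched_fraction(full: int, partial: int, total: int) -> float:
    """Return the fraction of sequences matched in all or part."""
    return (full + partial) / total


matched = FULLY_MATCHED + PARTLY_MATCHED
fraction = matched_fraction(FULLY_MATCHED, PARTLY_MATCHED, TOTAL_SEQUENCES)
print(f"{matched} of {TOTAL_SEQUENCES} sequences matched ({fraction:.0%})")
```

Rounded to a whole percentage, the result agrees with the 41% stated in the abstract.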
Pages: 14658-14663
Page count: 6