Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments

Cited by: 62
Authors
Chang, MSS
Benner, SA [1]
Affiliations
[1] Univ Florida, Dept Chem, Gainesville, FL 32611 USA
[2] Fdn Appl Mol Evolut, Gainesville, FL 32601 USA
Keywords
insertions; deletions; protein evolution; alignments; gaps;
DOI
10.1016/j.jmb.2004.05.045
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments containing exactly one gap, and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs, was examined. The alignments showed that "structure-breaking" amino acids (PGDNS) are preferred within and flanking gapped regions, as are two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences are modestly different in protein pairs separated by an episode of adaptive evolution than in pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as L^(-1.8). These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence, which suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment. © 2004 Elsevier Ltd. All rights reserved.
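The Zipfian gap-length model in the abstract can be illustrated with a short Python sketch. It is not the authors' code: the function names and the cutoff `max_len` are assumptions made here for illustration, and only the exponent 1.8 comes from the abstract. The sketch normalizes P(L) proportional to L^(-1.8) over a finite range of gap lengths and converts each probability to a negative log-probability, which is one common way to turn a length distribution into a length-dependent gap cost.

```python
import math

def zipf_gap_probabilities(exponent=1.8, max_len=50):
    """Normalized P(L) proportional to L**(-exponent) for L = 1..max_len.

    max_len is a hypothetical truncation chosen for this sketch; the
    abstract does not state an upper bound on gap length.
    """
    weights = [length ** -exponent for length in range(1, max_len + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gap_length_penalties(exponent=1.8, max_len=50):
    """Negative log-probabilities, usable as length-dependent gap costs."""
    return [-math.log(p) for p in zipf_gap_probabilities(exponent, max_len)]

probs = zipf_gap_probabilities()      # probs[0] is P(L = 1)
penalties = gap_length_penalties()    # penalties[0] is the cost of a length-1 gap
```

Under these assumptions the cost of a gap grows with the logarithm of its length (the cost of length L exceeds that of length 1 by 1.8 ln L), in contrast to the linear growth of an affine gap penalty.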
Pages: 617-631
Page count: 15