DETECTING FRAME SHIFTS BY AMINO-ACID-SEQUENCE COMPARISON

Cited by: 25
Author
CLAVERIE, JM
Affiliation
[1] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda
Keywords
AMINO ACID SEQUENCE; COMPUTER ANALYSIS; FRAME SHIFT; EVOLUTION; ERROR;
DOI
10.1006/jmbi.1993.1666
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Various amino acid substitution scoring matrices are used in conjunction with local alignments programs to detect regions of similarity and infer potential common ancestry between proteins. The usual scoring schemes derive from the implicit hypothesis that related proteins evolve from a common ancestor by the accumulation of point mutations and that amino acids tend to be progressively substituted by others with similar properties. However, other frequent single mutation events, like nucleotide insertion or deletion and gene inversion, change the translation reading frame and cause previously encoded amino acid sequences to become unrecognizable at once. Here, I derive five new types of scoring matrix, each capable of detecting a specific frame shift (deletion, insertion and inversion in 3 frames) and use them with a regular local alignments program to detect amino acid sequences that may have derived from alternative reading frames of the same nucleotide sequence. Frame shifts are inferred from the sole comparison of the protein sequences. The five scoring matrices were used with the BLASTP program to compare all the protein sequences in the Swissprot database. Surprisingly, the searches revealed hundreds of highly significant frame shift matches, of which many are likely to represent sequencing errors. Others provide some evidence that frame shift mutations might be used in protein evolution as a way to create new amino acid sequences from pre-existing coding regions. © 1993 Academic Press Limited.
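The idea of a frame-shift scoring matrix can be illustrated with a minimal sketch. It is not the derivation used in the paper: it assumes the standard genetic code, uniform and independent nucleotide frequencies, and only a single +1 reading-frame offset. The names `translate` and `frameshift_scores` are hypothetical. For every tetranucleotide, the first three bases give amino acid `a` in the original frame and the last three give amino acid `b` in the shifted frame; the log-odds of each observed `(a, b)` pair then plays the role of a substitution score.

```python
from collections import Counter
from itertools import product
from math import log2

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate complete codons of `dna`, starting at offset `frame`."""
    return "".join(CODON[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))


def frameshift_scores():
    """Illustrative +1 frame-shift log-odds scores (assumption: uniform bases).

    Returns (pair_counts, scores). Pairs involving a stop codon are skipped,
    and pairs that can never overlap are simply absent from the result.
    """
    pair = Counter()
    for tetra in product(BASES, repeat=4):
        s = "".join(tetra)
        a, b = CODON[s[:3]], CODON[s[1:]]  # original frame vs. shifted frame
        if "*" not in (a, b):
            pair[(a, b)] += 1
    total = sum(pair.values())
    row, col = Counter(), Counter()
    for (a, b), n in pair.items():
        row[a] += n
        col[b] += n
    # log2 of observed pair frequency over the product of marginal frequencies
    scores = {(a, b): log2(n * total / (row[a] * col[b]))
              for (a, b), n in pair.items()}
    return pair, scores


if __name__ == "__main__":
    print(translate("ATGGCC"), translate("ATGGCC", frame=1))
    pair, scores = frameshift_scores()
    print(pair[("M", "W")], round(scores[("M", "W")], 2))
```

Under these assumptions, `translate("ATGGCC")` gives `MA` while the same nucleotides read from offset 1 give `W`, so a pair such as (M, W) receives a positive score and a pair that can never overlap, such as (M, F), receives none. A realistic matrix would also need codon usage frequencies and separate treatment of the other frames and of the reverse strand.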
Pages: 1140-1157
Number of pages: 18