Identification of related proteins with weak sequence identity using secondary structure information

Cited by: 46
Authors
Geourjon, C [1]
Combet, C [1]
Blanchet, C [1]
Deléage, G [1]
Affiliations
[1] Inst Biol & Chim Prot, CNRS, UMR 5086, Pole Bioinformat Lyonnais, F-69367 Lyon 07, France
Keywords
protein; molecular modeling; sequence; databank; alignment; structure prediction; secondary structure; Web server
DOI
10.1110/ps.30001
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Molecular modeling of proteins is confronted with the problem of finding homologous proteins, especially when few identities remain after the process of molecular evolution. Even with the most recent methods based on sequence identity detection, structural relationships are still difficult to establish with high reliability. As protein structures are more conserved than sequences, we investigated the possibility of using protein secondary structure comparison (observed or predicted structures) to discriminate between related and unrelated protein sequences in the range of 10%-30% sequence identity. Pairwise comparisons of secondary structures were measured using the structural overlap (Sov) parameter. In this article, we show that if the secondary structure likeness is >50%, most of the pairs are structurally related. Taking into account the secondary structures of proteins that have been detected by BLAST, FASTA, or SSEARCH in the noisy region (with high E value), we show that distantly related protein sequences (even with <20% identity) can still be identified. This strategy can be used to identify three-dimensional templates in homology modeling by finding unexpected related proteins and to select proteins for experimental investigation in a structural genomics approach, as well as for genome annotation.
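
The comparison step described in the abstract can be illustrated with a minimal Python sketch. It assumes two secondary structure strings that are already aligned, have equal length, and use a three-state alphabet (H = helix, E = strand, C = coil). The scoring follows one common formulation of the segment overlap measure; whether it matches the exact Sov variant and gap handling used in the article is an assumption. The names `segments`, `sov`, `likely_related`, and `SOV_THRESHOLD` are hypothetical, and only the 50% cutoff comes from the abstract.

```python
from itertools import groupby


def segments(ss, state):
    """Return inclusive (start, end) index pairs for each run of `state` in `ss`."""
    segs, pos = [], 0
    for label, run in groupby(ss):
        n = len(list(run))
        if label == state:
            segs.append((pos, pos + n - 1))
        pos += n
    return segs


def sov(ss1, ss2, states="HEC"):
    """Segment overlap score (0-100) of ss2 against the reference ss1.

    Assumes both strings are aligned residue by residue (no gaps).
    The score is not symmetric: segment lengths are taken from ss1.
    """
    if len(ss1) != len(ss2):
        raise ValueError("secondary structure strings must have equal length")
    total, norm = 0.0, 0
    for state in states:
        segs2 = segments(ss2, state)
        for a1, b1 in segments(ss1, state):
            len1 = b1 - a1 + 1
            overlapping = [(a2, b2) for a2, b2 in segs2 if a2 <= b1 and a1 <= b2]
            if not overlapping:
                # Reference segment with no partner still counts in the normalization.
                norm += len1
                continue
            for a2, b2 in overlapping:
                len2 = b2 - a2 + 1
                minov = min(b1, b2) - max(a1, a2) + 1  # residues shared by both segments
                maxov = max(b1, b2) - min(a1, a2) + 1  # total extent covered by the pair
                delta = min(maxov - minov, minov, len1 // 2, len2 // 2)
                total += (minov + delta) / maxov * len1
                norm += len1
    return 100.0 * total / norm if norm else 0.0


SOV_THRESHOLD = 50.0  # cutoff quoted in the abstract


def likely_related(ss1, ss2, threshold=SOV_THRESHOLD):
    """Flag a pair as a candidate structural relative when Sov exceeds the threshold."""
    return sov(ss1, ss2) > threshold
```

In a workflow like the one the abstract outlines, such a function would be applied to low-identity hits returned by BLAST, FASTA, or SSEARCH, keeping only the pairs that pass the threshold. Because the score depends on which string is treated as the reference, a practical filter might evaluate both directions.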
Pages: 788-797
Number of pages: 12