Fast detection of common geometric substructure in proteins

Cited by: 32
Authors
Chew, LP
Huttenlocher, D
Kedem, K
Kleinberg, J [1]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
[2] Ben Gurion Univ Negev, Dept Math & Comp Sci, IL-84105 Beer Sheva, Israel
Keywords
structure matching; unit-vector RMS; efficient algorithms
DOI
10.1089/106652799318292
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the α-carbon backbone structures of the proteins in order to find three-dimensional (3D) rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. This representation encodes the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of α-carbons. We then define a measure of the similarity of two protein structures based on the root mean squared (RMS) distance between corresponding orientation vectors of the two proteins. Our measure has several advantages over measures that are commonly used for comparing protein shapes, such as the minimum RMS distance between the 3D positions of corresponding atoms in two proteins. A key advantage is that this new measure behaves well for identifying common substructures, in contrast with position-based measures where the nonmatching portions of the structure dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with methods based on distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
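The unit-vector representation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `unit_vectors` and `unit_vector_rms` and the toy coordinates are hypothetical. The sketch compares the two chains in a fixed orientation and omits the search over 3D rigid motions that the abstract describes.

```python
import math


def unit_vectors(ca_coords):
    """Convert consecutive alpha-carbon positions into unit direction vectors.

    ca_coords is a list of (x, y, z) tuples; the result has one fewer entry.
    """
    vectors = []
    for (x0, y0, z0), (x1, y1, z1) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            raise ValueError("consecutive alpha-carbons coincide")
        vectors.append((dx / norm, dy / norm, dz / norm))
    return vectors


def unit_vector_rms(u, v):
    """RMS distance between corresponding unit vectors, in a fixed orientation.

    Assumption: no rotation is optimized here; the chains are compared as given.
    """
    if len(u) != len(v) or not u:
        raise ValueError("need two non-empty vector lists of equal length")
    total = sum((a - b) ** 2 for p, q in zip(u, v) for a, b in zip(p, q))
    return math.sqrt(total / len(u))


# Hypothetical toy backbones (not real protein coordinates).
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]
# Same shape as chain_a, shifted in space.
chain_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in chain_a]
# Differs from chain_a in the direction of its second bond.
chain_c = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 0.0, 3.8)]

print(unit_vector_rms(unit_vectors(chain_a), unit_vectors(chain_b)))
print(unit_vector_rms(unit_vectors(chain_a), unit_vectors(chain_c)))
```

Because only directions between adjacent α-carbons are kept, translating a chain leaves its representation unchanged. Two unit vectors are never more than 2 apart, so a single mismatched pair contributes a bounded amount to the sum, which is consistent with the robustness to outliers claimed in the abstract.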
Pages: 313-325
Page count: 13