Searching Protein 3-D Structures in Linear Time

被引:0
作者
Shibuya, Tetsuo [1 ]
机构
[1] Univ Tokyo, Inst Med Sci, Ctr Human Genome, Minato Ku, Tokyo 1088639, Japan
来源
RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, PROCEEDINGS | 2009年 / 5541卷
关键词
GEOMETRIC SUFFIX TREE; RELATE; 2; SETS; ALGORITHMS; ROTATION; VECTORS;
D O I
暂无
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Finding similar structures from 3-D structure databases of proteins is becoming more and more important issue in the post-genomic molecular biology. To compare 3-D structures of two molecules, biologists mostly use the RMSD (root mean square deviation) as the similarity measure. We propose new theoretically and practically fast algorithms for the fundamental problem of finding all the substructures of structures in a structure database of chain molecules (such is proteins), whose RMSDs to the query are within a given constant threshold. We first propose a breakthrough linear-expected-time algorithm for the problem, while the previous best-known time complexity was O(N log m), where N is the database size and in is the query size. For the expected time analysis, we propose to use the random-walk model (or the ideal chain model) as the model of average protein structures. We furthermore propose a series of preprocessing algorithms that enable faster queries. We checked the performance of our linear-expected-time algorithm through computational experiments over the whole PDB database. According to the experiments, our algorithm is 3.6 to 28 times faster than previously known algorithms for ordinary queries. Moreover, the experimental results support the validity of our theoretical analyses.
引用
收藏
页码:1 / 15
页数:15
相关论文
共 19 条
[1]  
[Anonymous], 1996, MATRIX COMPUTATION
[2]  
[Anonymous], 1997, Foundations of Modern probability Theory
[3]   LEAST-SQUARES FITTING OF 2 3-D POINT SETS [J].
ARUN, KS ;
HUANG, TS ;
BLOSTEIN, SD .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1987, 9 (05) :699-700
[4]   Rapid retrieval of protein structures from databases [J].
Aung, Zeyar ;
Tan, Kian-Lee .
DRUG DISCOVERY TODAY, 2007, 12 (17-18) :732-739
[5]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[6]  
BOYD RH, 1996, STRUCTURE PROPERTIES
[7]   AN ALGORITHM FOR MACHINE CALCULATION OF COMPLEX FOURIER SERIES [J].
COOLEY, JW ;
TUKEY, JW .
MATHEMATICS OF COMPUTATION, 1965, 19 (90) :297-&
[8]  
De Gennes P.-G., 1979, SCALING CONCEPTS POL
[9]   Estimating 3-D rigid body transformations: A comparison of four major algorithms [J].
Eggert, DW ;
Lorusso, A ;
Fischer, RB .
MACHINE VISION AND APPLICATIONS, 1997, 9 (5-6) :272-290
[10]   Structure comparison and structure patterns [J].
Eidhammer, I ;
Jonassen, I ;
Taylor, WR .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (05) :685-716