FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING

Times cited: 0
Authors
Paquet, Eric [1]
Viktor, Herna Lydia [2]
Affiliations
[1] Natl Res Council Canada, Inst Informat Technol, Ottawa, ON K1A 0R6, Canada
[2] Univ Ottawa, Sch IT & Engn, Ottawa, ON K1N 6N5, Canada
Source
KDIR 2009: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL | 2009
Keywords
Indexing; Proteins; Representation; Searching; 3D; RETRIEVAL-SYSTEM;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research suggests that the complex geometric shapes of amino-acid sequence folds often determine their functions. To help domain experts classify new protein structures and identify the functions of such new discoveries, accurate shape-related algorithms for locating similar protein structures are needed. To this end, we present our Content-based Analysis of Protein Structure for Retrieval and Indexing system, which locates protein families, and identifies similarities between families, based on the 2D and 3D signatures of protein structures. Our approach is novel in that we utilize five different representations with a query-by-prototype approach. These diverse representations let us view a particular protein structure, and the family it belongs to, focusing on (1) the C-α chain, (2) the atomic position, (3) the secondary structure, based on (4) residue type or (5) residue name. Our experimental results indicate that our method accurately locates protein families when evaluated against the 53,000 entries in the Protein Data Bank, performing an exhaustive search in a fraction of a second.
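The query-by-prototype, exhaustive-search idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes each structure has already been reduced to a fixed-length numeric signature, and it uses Euclidean distance as a stand-in similarity measure. The function name `rank_by_prototype`, the entry names, and the toy vectors are all hypothetical.

```python
import math


def rank_by_prototype(prototype_id, signatures, top_k=3):
    """Compare the prototype's signature with every other entry
    (exhaustive search) and return the closest entry ids."""
    query = signatures[prototype_id]
    scored = []
    for entry_id, vector in signatures.items():
        if entry_id == prototype_id:
            continue  # the prototype itself is not a search result
        # Assumption: Euclidean distance between signature vectors.
        scored.append((math.dist(query, vector), entry_id))
    scored.sort()
    return [entry_id for _, entry_id in scored[:top_k]]


# Hypothetical toy signatures; real descriptors would be much longer.
signatures = {
    "protA": [0.10, 0.80, 0.30],
    "protB": [0.12, 0.79, 0.31],
    "protC": [0.90, 0.10, 0.50],
    "protD": [0.11, 0.75, 0.35],
}

print(rank_by_prototype("protA", signatures, top_k=2))
```

Because every stored signature is visited once, the cost grows linearly with the number of entries, which is why compact precomputed signatures matter for a database of tens of thousands of structures.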
Pages: 127-+
Number of pages: 2
Related papers
19 in total
[1]   Shape modeling and matching in identifying 3D protein structures [J].
Abeysinghe, Sasakthi ;
Ju, Tao ;
Baker, Matthew L. ;
Chiu, Wah .
COMPUTER-AIDED DESIGN, 2008, 40 (06) :708-720
[2]   Exploiting geometrical properties on protein similarity search [J].
Akbar, Saiful ;
Kueng, Josef ;
Wagner, Roland .
SEVENTEENTH INTERNATIONAL CONFERENCE ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2006, :228-+
[3]   Data growth and its impact on the SCOP database: new developments [J].
Andreeva, Antonina ;
Howorth, Dave ;
Chandonia, John-Marc ;
Brenner, Steven E. ;
Hubbard, Tim J. P. ;
Chothia, Cyrus ;
Murzin, Alexey G. .
NUCLEIC ACIDS RESEARCH, 2008, 36 :D419-D425
[4]  
Berman H. M., 2008, PROTEIN DATA BANK
[5]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[6]   A fast protein structure retrieval system using image-based distance matrices and multidimensional index [J].
Chi, PH ;
Scott, G ;
Shyu, CR .
BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, :522-529
[7]  
Cui CY, 2008, INT CONF BIOMED, P98, DOI [10.1109/BMEI.2008.21, 10.1109/MEI.2008.21]
[8]  
Cui CY, 2004, P SOC PHOTO-OPT INS, V5307, P543
[9]   Three-dimensional shape-structure comparison method for protein classification [J].
Daras, Petros ;
Zarpalas, Dimitrios ;
Axenopoulos, Apostolos ;
Tzovaras, Dimitrios ;
Strintzis, Michael Gerassimos .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2006, 3 (03) :193-207
[10]  
Huang Z, 2006, LECT NOTES COMPUT SC, V4080, P528