Discovery of Spatially Cohesive Itemsets in Three-Dimensional Protein Structures

Cited by: 4
Authors
Zhou, Cheng [1]
Meysman, Pieter [1]
Cule, Boris [1]
Laukens, Kris [1]
Goethals, Bart [1]
Affiliations
[1] Univ Antwerp, Dept Math & Comp Sci, B-2020 Antwerp, Belgium
Keywords
Itemset mining; multidimensional data; cohesion; protein structure; RECOGNITION; BINDING; NETWORK; MOTIFS; MODEL
DOI
10.1109/TCBB.2014.2311795
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In this paper, we present a cohesive structural itemset miner that aims to discover interesting patterns in a set of data objects within a multidimensional spatial structure by combining the cohesion and the support of the pattern. We propose two ways to build the itemset miner, VertexOne and VertexAll, in an attempt to find a balance between accuracy and run time. The experiments show that VertexOne performs better, finding almost the same itemsets as VertexAll in a much shorter time. The usefulness of the method is demonstrated by applying it to find interesting patterns of amino acids in spatial proximity within a set of proteins, based on their atomic coordinates in the protein molecular structure. Several patterns found by the cohesive structural itemset miner contain amino acids that frequently co-occur in the spatial structure, even if they are distant in the primary protein sequence and only brought together by protein folding. Furthermore, various indications were found that some of the discovered patterns seem to represent common underlying support structures within the proteins.
Pages: 814-825
Number of pages: 12