SLIDER: A Generic Metaheuristic for the Discovery of Correlated Motifs in Protein-Protein Interaction Networks

被引:6
作者
Boyen, Peter [1 ,2 ]
Van Dyck, Dries [1 ,2 ]
Neven, Frank [1 ,2 ]
van Ham, Roeland C. H. J.
van Dijk, Aalt D. J. [3 ]
机构
[1] Hasselt Univ, B-3590 Diepenbeek, Belgium
[2] Transnatl Univ Limburg, B-3590 Diepenbeek, Belgium
[3] Wageningen Univ & Res Ctr, Bioinformat Grp, NL-6708 PB Wageningen, Netherlands
关键词
Graphs and networks; biology and genetics; PAIRS;
D O I
10.1109/TCBB.2011.17
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Correlated motif mining (CMM) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for CMM thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we obtain that CMM is an NP-hard problem for a large class of support measures ( including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic SLIDER which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that SLIDER outperforms existing motif-driven CMM methods and scales to large protein-protein interaction networks. The SLIDER-implementation and the data used in the experiments are available on http://bioinformatics.uhasselt.be.
引用
收藏
页码:1344 / 1357
页数:14
相关论文
共 20 条
[1]  
AARTS E.H.L., 1997, LOCAL SEARCH COMBINA
[2]   Ten thousand interactions for the molecular biologist [J].
Aloy, P ;
Russell, RB .
NATURE BIOTECHNOLOGY, 2004, 22 (10) :1317-1321
[3]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theory of NP-Completeness
[4]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[5]   Metaheuristics in combinatorial optimization: Overview and conceptual comparison [J].
Blum, C ;
Roli, A .
ACM COMPUTING SURVEYS, 2003, 35 (03) :268-308
[6]   SLIDER: Mining correlated motifs in protein-protein interaction networks [J].
Boyen, Peter ;
Neven, Frank ;
Van Dyck, Dries ;
van Dijk, Aalt D. J. ;
van Ham, Roeland C. H. J. .
2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, :716-+
[7]   Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae [J].
Collins, Sean R. ;
Kemmeren, Patrick ;
Zhao, Xue-Chu ;
Greenblatt, Jack F. ;
Spencer, Forrest ;
Holstege, Frank C. P. ;
Weissman, Jonathan S. ;
Krogan, Nevan J. .
MOLECULAR & CELLULAR PROTEOMICS, 2007, 6 (03) :439-450
[8]   PRISM: A prime-encoding approach for frequent sequence mining [J].
Gouda, Karam ;
Hassaan, Mosab ;
Zaki, Mohammed J. .
ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, :487-+
[9]  
Hubbard S., 1993, Computer Program, Department of Biochemistry Molecular Biology
[10]   GTOP: a database of protein structures predicted from genome sequences [J].
Kawabata, T ;
Fukuchi, S ;
Homma, K ;
Ota, M ;
Araki, J ;
Ito, T ;
Ichiyoshi, N ;
Nishikawa, K .
NUCLEIC ACIDS RESEARCH, 2002, 30 (01) :294-298