SLIDER: A Generic Metaheuristic for the Discovery of Correlated Motifs in Protein-Protein Interaction Networks

被引：6

作者：

Boyen, Peter ^{[1
,2
]}

Van Dyck, Dries ^{[1
,2
]}

Neven, Frank ^{[1
,2
]}

van Ham, Roeland C. H. J.

van Dijk, Aalt D. J. ^{[3
]}

机构：

[1] Hasselt Univ, B-3590 Diepenbeek, Belgium

[2] Transnatl Univ Limburg, B-3590 Diepenbeek, Belgium

[3] Wageningen Univ & Res Ctr, Bioinformat Grp, NL-6708 PB Wageningen, Netherlands

来源：

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS | 2011年 / 8卷 / 05期

关键词：

Graphs and networks; biology and genetics; PAIRS;

D O I：

10.1109/TCBB.2011.17

中图分类号：

Q5 [生物化学];

学科分类号：

071010 ; 081704 ;

摘要：

Correlated motif mining (CMM) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for CMM thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we obtain that CMM is an NP-hard problem for a large class of support measures ( including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic SLIDER which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that SLIDER outperforms existing motif-driven CMM methods and scales to large protein-protein interaction networks. The SLIDER-implementation and the data used in the experiments are available on http://bioinformatics.uhasselt.be.

引用

页码：1344 / 1357

页数：14

共 20 条

[1]

AARTS E.H.L., 1997, LOCAL SEARCH COMBINA

[2] Ten thousand interactions for the molecular biologist [J].

Aloy, P ;

Russell, RB .

NATURE BIOTECHNOLOGY, 2004, 22 (10) :1317-1321

[3]

[Anonymous], 1979, Computers and Intractablity: A Guide to the Theory of NP-Completeness

[4] The Protein Data Bank [J].

Berman, HM ;

Westbrook, J ;

Feng, Z ;

Gilliland, G ;

Bhat, TN ;

Weissig, H ;

Shindyalov, IN ;

Bourne, PE .

NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242

[5] Metaheuristics in combinatorial optimization: Overview and conceptual comparison [J].

Blum, C ;

Roli, A .

ACM COMPUTING SURVEYS, 2003, 35 (03) :268-308

[6] SLIDER: Mining correlated motifs in protein-protein interaction networks [J].

Boyen, Peter ;

Neven, Frank ;

Van Dyck, Dries ;

van Dijk, Aalt D. J. ;

van Ham, Roeland C. H. J. .

2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, :716-+

[7] Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae [J].

Collins, Sean R. ;

Kemmeren, Patrick ;

Zhao, Xue-Chu ;

Greenblatt, Jack F. ;

Spencer, Forrest ;

Holstege, Frank C. P. ;

Weissman, Jonathan S. ;

Krogan, Nevan J. .

MOLECULAR & CELLULAR PROTEOMICS, 2007, 6 (03) :439-450

[8] PRISM: A prime-encoding approach for frequent sequence mining [J].

Gouda, Karam ;

Hassaan, Mosab ;

Zaki, Mohammed J. .

ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, :487-+

[9]

Hubbard S., 1993, Computer Program, Department of Biochemistry Molecular Biology

[10] GTOP: a database of protein structures predicted from genome sequences [J].

Kawabata, T ;

Fukuchi, S ;

Homma, K ;

Ota, M ;

Araki, J ;

Ito, T ;

Ichiyoshi, N ;

Nishikawa, K .

NUCLEIC ACIDS RESEARCH, 2002, 30 (01) :294-298

← 1 2 →