RRW: repeated random walks on genome-scale protein networks for local cluster discovery

Cited by: 137
Authors
Macropol, Kathy [1 ]
Can, Tolga [2 ]
Singh, Ambuj K. [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
[2] Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
COMPLEXES; ALGORITHMS; MODULES;
DOI
10.1186/1471-2105-10-283
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long-range interactions between proteins. Results: We apply the proposed technique to a functional network of yeast genes and accurately identify statistically significant clusters of proteins. We validate the biological significance of the results using known complexes in the MIPS complex catalogue database and well-characterized biological processes. We find that 90% of the created clusters have the majority of their catalogued proteins belonging to the same MIPS complex, and about 80% have the majority of their proteins involved in the same biological process. We compare our method to various other clustering techniques, such as the Markov Clustering Algorithm (MCL), and find a significant improvement in the RRW clusters' precision and accuracy values. Conclusion: RRW, a technique that exploits the topology of the network, is more precise and robust in finding local clusters. In addition, it can find multifunctional proteins because it allows clusters to overlap.
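The abstract names the technique but does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a generic random walk with restart on a weighted undirected graph, followed by a greedy expansion that repeatedly adds the node with the highest visiting probability. The function names (`build_graph`, `random_walk_with_restart`, `grow_cluster`), the restart probability of 0.6, the size cap, and the toy network are all hypothetical choices made for illustration.

```python
def build_graph(edges):
    """Build a symmetric adjacency dict {node: {neighbor: weight}}."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def random_walk_with_restart(adj, seeds, restart=0.6, tol=1e-12, max_iter=10000):
    """Stationary visiting probabilities of a walker that, at each step,
    jumps back to a uniformly chosen seed with probability `restart` and
    otherwise follows an edge with probability proportional to its weight.
    `seeds` is assumed to contain distinct nodes of `adj`."""
    seeds = list(seeds)
    start = {v: 0.0 for v in adj}
    for s in seeds:
        start[s] = 1.0 / len(seeds)
    strength = {v: sum(adj[v].values()) for v in adj}

    p = dict(start)
    for _ in range(max_iter):
        nxt = {v: restart * start[v] for v in adj}
        for u, nbrs in adj.items():
            move = (1.0 - restart) * p[u]
            if strength[u] == 0.0:
                # Isolated node: send the walker back to the seeds.
                for s in seeds:
                    nxt[s] += move / len(seeds)
                continue
            for v, w in nbrs.items():
                nxt[v] += move * w / strength[u]
        delta = sum(abs(nxt[v] - p[v]) for v in adj)
        p = nxt
        if delta < tol:
            break
    return p


def grow_cluster(adj, seed, max_size=3, restart=0.6):
    """Greedy expansion (an assumption of this sketch): restart the walk
    from the current cluster and add the best-scoring outside node."""
    cluster = [seed]
    while len(cluster) < max_size:
        p = random_walk_with_restart(adj, cluster, restart)
        candidates = [v for v in adj if v not in cluster]
        if not candidates:
            break
        best = max(candidates, key=lambda v: p[v])
        if p[best] == 0.0:
            break  # nothing else is reachable from the cluster
        cluster.append(best)
    return cluster


# Toy network: two tightly linked triangles joined by one weak edge.
toy = build_graph([
    ("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 1.0),
    ("d", "e", 1.0), ("d", "f", 1.0), ("e", "f", 1.0),
    ("c", "d", 0.1),
])
print(sorted(grow_cluster(toy, "a", max_size=3)))
```

On this toy network the walk started at `a` stays mostly inside its own triangle, because the weak `c`-`d` edge carries little probability mass. Running `grow_cluster` from several seeds and keeping all results would let one node appear in more than one cluster, which is the kind of overlap the abstract mentions.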
Pages: 10