clusterMaker: a multi-algorithm clustering plugin for Cytoscape

Cited by: 444
Authors
Morris, John H. [1]
Apeltsin, Leonard [1]
Newman, Aaron M. [2]
Baumbach, Jan [4]
Wittkop, Tobias [3]
Su, Gang [5, 6]
Bader, Gary D. [7, 8]
Ferrin, Thomas E. [1, 9]
Affiliations
[1] Univ Calif San Francisco, Dept Pharmaceut Chem, San Francisco, CA 94143 USA
[2] Stanford Univ, Sch Med, Inst Stem Cell Biol & Regenerat Med, Stanford, CA 94305 USA
[3] Buck Inst Age Res, Novato, CA USA
[4] Max Planck Inst Informat, Saarbrucken, Germany
[5] Univ Michigan, Bioinformat Program, Ann Arbor, MI 48109 USA
[6] Univ Michigan, Natl Ctr Integrat Biomed Informat, Ann Arbor, MI 48109 USA
[7] Univ Toronto, Donnelly Ctr, Toronto, ON, Canada
[8] Univ Toronto, Dept Mol Genet, Toronto, ON, Canada
[9] Univ Calif San Francisco, Dept Bioengn & Therapeut Sci, San Francisco, CA 94143 USA
Keywords
PROTEIN COMPLEXES; EXPRESSION DATA; INTERACTION MAP; SEQUENCE SPACE; CLASSIFICATION; IDENTIFICATION; YEAST; VISUALIZATION; SYSTEM; GENOME
DOI
10.1186/1471-2105-12-436
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL.
Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section.
Conclusions: The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.
Pages: 14