Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Cited by: 49
Authors
Gaiteri, Chris [1, 2]
Chen, Mingming [3]
Szymanski, Boleslaw [3, 4]
Kuzmin, Konstantin [3]
Xie, Jierui [3, 5]
Lee, Changkyu [2]
Blanche, Timothy [2]
Neto, Elias Chaibub [6]
Huang, Su-Chun [7]
Grabowski, Thomas [7, 8]
Madhyastha, Tara [8]
Komashko, Vitalina [9]
Affiliations
[1] Rush Univ, Med Ctr, Alzheimers Dis Ctr, Chicago, IL 60612 USA
[2] Allen Inst Brain Sci Modeling Anal & Theory Grp, Seattle, WA USA
[3] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[4] Spoleczna Akad Nauk, Lodz, Poland
[5] Samsung Res Amer, San Jose, CA USA
[6] Sage Bionetworks, Seattle, WA USA
[7] Univ Washington, Dept Neurol, Seattle, WA 98195 USA
[8] Univ Washington, Dept Radiol, Seattle, WA 98195 USA
[9] Trialomics, Seattle, WA USA
Source
SCIENTIFIC REPORTS | 2015 / Volume 5
Funding
National Institutes of Health (USA);
Keywords
SMALL-WORLD; COEXPRESSION NETWORKS; COMPREHENSIVE ATLAS; ANALYSIS REVEALS; ORGANIZATION; ANNOTATION; MODULARITY; DYNAMICS;
DOI
10.1038/srep16361
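Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code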
07 ; 0710 ; 09 ;
Abstract
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.
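The abstract says that nodes join communities based on local connections as well as global information about the network. The sketch below illustrates that general idea with a toy label-propagation loop. It is not the SpeakEasy implementation. The function name `propagate_labels`, its parameters, and the scoring rule (a label's count among a node's neighbours minus the count expected from that label's network-wide share) are assumptions made for illustration, and the sketch yields non-overlapping labels only.

```python
import random
from collections import Counter


def propagate_labels(adj, n_rounds=20, seed=0):
    """Toy label propagation with a global-frequency correction.

    adj: dict mapping each node to the set of its neighbours (undirected).
    Returns a dict mapping each node to a community label.
    Hypothetical illustration only, not the published algorithm.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # every node starts in its own community
    nodes = sorted(adj)
    n = len(nodes)
    for _ in range(n_rounds):
        rng.shuffle(nodes)
        global_freq = Counter(labels.values())  # global view: label sizes
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated nodes keep their own label
            local = Counter(labels[u] for u in adj[v])  # local view
            deg = len(adj[v])

            def score(lab):
                # observed neighbour count minus the count expected
                # if labels were spread according to their global share
                return local[lab] - deg * global_freq[lab] / n

            best = max(sorted(local), key=score)
            if best != labels[v]:
                global_freq[labels[v]] -= 1
                global_freq[best] += 1
                labels[v] = best
                changed = True
        if not changed:
            break  # no node changed its label in this round
    return labels


if __name__ == "__main__":
    # Two disconnected triangles as a small example network.
    example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
               3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(propagate_labels(example))
```

Subtracting the expected count keeps a label from winning merely because it is already common across the network, which is one simple way to mix a bottom-up (neighbour) signal with a top-down (global) one.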
Pages: 14