Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Cited by: 49
Authors
Gaiteri, Chris [1, 2]
Chen, Mingming [3]
Szymanski, Boleslaw [3, 4]
Kuzmin, Konstantin [3]
Xie, Jierui [3, 5]
Lee, Changkyu [2]
Blanche, Timothy [2]
Neto, Elias Chaibub [6]
Huang, Su-Chun [7]
Grabowski, Thomas [7, 8]
Madhyastha, Tara [8]
Komashko, Vitalina [9]
Affiliations
[1] Rush Univ, Med Ctr, Alzheimers Dis Ctr, Chicago, IL 60612 USA
[2] Allen Inst Brain Sci Modeling Anal & Theory Grp, Seattle, WA USA
[3] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[4] Spoleczna Akad Nauk, Lodz, Poland
[5] Samsung Res Amer, San Jose, CA USA
[6] Sage Bionetworks, Seattle, WA USA
[7] Univ Washington, Dept Neurol, Seattle, WA 98195 USA
[8] Univ Washington, Dept Radiol, Seattle, WA 98195 USA
[9] Trialomics, Seattle, WA USA
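Source
SCIENTIFIC REPORTS | 2015 / Volume 5
Funding
National Institutes of Health (NIH), USA;
Keywords
SMALL-WORLD; COEXPRESSION NETWORKS; COMPREHENSIVE ATLAS; ANALYSIS REVEALS; ORGANIZATION; ANNOTATION; MODULARITY; DYNAMICS;
DOI
10.1038/srep16361
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract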
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy, which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including gene microarrays, protein interactions, sorted cell populations, electrophysiology, and fMRI brain imaging.
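
The abstract's idea of nodes joining communities from both local connections and global network information can be illustrated with a small label-propagation sketch. This is not the published SpeakEasy implementation. The function name `detect_communities`, the scoring rule (a label's count among a node's neighbours minus the count expected from that label's network-wide frequency), the largest-label tie-break, and the toy graph are all assumptions made for illustration. The sketch also produces non-overlapping communities only and does not estimate stability.

```python
from collections import Counter


def detect_communities(adjacency, max_rounds=20):
    """Toy label propagation with a global-frequency correction.

    adjacency: dict mapping each node to the set of its neighbours.
    Returns a dict mapping each node to a community label.
    Illustrative assumption only, not the SpeakEasy algorithm itself.
    """
    labels = {node: node for node in adjacency}  # every node starts alone
    n = len(adjacency)
    for _ in range(max_rounds):
        # Global (top-down) information: label frequencies at round start.
        global_counts = Counter(labels.values())
        changed = False
        for node in sorted(adjacency):  # fixed order keeps the run deterministic
            neighbours = adjacency[node]
            if not neighbours:
                continue
            # Local (bottom-up) information: labels seen among neighbours.
            local_counts = Counter(labels[v] for v in neighbours)
            degree = len(neighbours)

            def score(label):
                expected = degree * global_counts[label] / n
                # Ties are broken by the larger label, purely for determinism.
                return (local_counts[label] - expected, label)

            best = max(local_counts, key=score)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical example: two triangles joined by a single bridge edge (2-3).
example_graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
labels = detect_communities(example_graph)
print(labels)
```

On this toy graph the two triangles end up with different labels, and the number of communities is not specified in advance.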
Pages: 14