An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks

Times cited: 198
Authors
Botia, Juan A. [1, 2]
Vandrovcova, Jana [2]
Forabosco, Paola [3 ]
Guelfi, Sebastian [2 ]
D'Sa, Karishma [1, 2]
Hardy, John [2 ]
Lewis, Cathryn M. [1 ]
Ryten, Mina [1, 2]
Weale, Michael E. [1 ]
Affiliations
[1] UCL, Inst Neurol, Dept Mol Neurosci, Queen Sq, London WC1N, England
[2] Kings Coll London, Sch Med Sci, Dept Med & Mol Genet, Guys Hosp, London SE1 9RT, England
[3] Cittadella Univ Monserrato, CNR, Ist Ric Genet & Biomed, I-09042 Monserrato, CA, Italy
Funding
UK Medical Research Council;
Keywords
Gene co-expression networks on brain; K-means applied to WGCNA; Assessment of better gene clusters on bulk tissue; EXPRESSION DATA; GENOTYPE; INSIGHTS;
DOI
10.1186/s12918-017-0420-6
Chinese Library Classification number
Q [Biological sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Background: Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for generating gene co-expression networks (GCNs). WGCNA produces both a GCN and a derived partition of the genes into clusters (modules). We propose k-means clustering as an additional processing step after conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). Results: We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (3.1-fold on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. Conclusions: Our results indicate that the k-means method, applied as an adjunct to standard WGCNA, yields better network partitions. These improved partitions enable more fruitful downstream analyses, because the gene modules are more biologically meaningful.
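The abstract does not spell out how the k-means step operates, so the following is only a minimal Python sketch of the general idea, not the km2gcn implementation. It assumes that the initial module labels come from a prior WGCNA run, that each module is summarised by the mean of its standardised gene profiles (a stand-in for a WGCNA module eigengene), and that each gene is reassigned to the module whose summary profile it correlates with most strongly. The names `zscore_rows` and `refine_modules` are hypothetical.

```python
import numpy as np


def zscore_rows(x):
    """Standardise each row (gene) to mean 0 and unit variance across samples."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=1, keepdims=True)
    sd = centered.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # avoid division by zero for constant rows
    return centered / sd


def refine_modules(expr, labels, max_iter=30):
    """k-means-style refinement of an initial gene-to-module partition.

    expr   : array of shape (genes, samples); constant genes are assumed
             to have been filtered out beforehand.
    labels : initial module label per gene (assumed to come from WGCNA).
    Returns the refined label array.
    """
    z = zscore_rows(expr)
    labels = np.asarray(labels).copy()
    n_samples = z.shape[1]
    for _ in range(max_iter):
        modules = np.unique(labels)  # modules that lose all genes drop out
        # Assumption: the mean standardised profile acts as the module centroid.
        centroids = zscore_rows(
            np.vstack([z[labels == m].mean(axis=0) for m in modules])
        )
        # Rows of z and centroids are standardised, so this is Pearson correlation.
        corr = z @ centroids.T / n_samples
        new_labels = modules[np.argmax(corr, axis=1)]
        if np.array_equal(new_labels, labels):
            break  # converged: no gene changed module
        labels = new_labels
    return labels
```

Calling `refine_modules(expr, initial_labels)` on a genes-by-samples matrix returns a label vector of the same length. A gene that the initial partition placed in a poorly matching module moves to the module it correlates with best, which is one way to read the "few or zero misplaced genes" property reported in the abstract.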
Pages: 16