A Comparison of Gene Set Analysis Methods in Terms of Sensitivity, Prioritization and Specificity

Times cited: 136
Authors
Tarca, Adi L. [1, 2]
Bhatti, Gaurav [2]
Romero, Roberto [2, 3, 4]
Affiliations
[1] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
[2] NICHHD, Perinatol Res Branch, NIH, Rockville, MD USA
[3] Univ Michigan, Dept Obstet & Gynecol, Ann Arbor, MI 48109 USA
[4] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA
Source
PLOS ONE | 2013 / Vol. 8 / Issue 11
Funding
National Institutes of Health (USA);
Keywords
EXPRESSION; ENRICHMENT; PATHWAYS; BIOLOGY;
DOI
10.1371/journal.pone.0079217
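Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;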
Abstract
Identification of functional sets of genes associated with conditions of interest from omics data was first reported in 1999, and since then a plethora of enrichment methods have been published for the systematic analysis of gene set collections, including Gene Ontology and biological pathways. Despite their widespread use in reducing the complexity of omics experiment results, their performance is poorly understood. Leveraging the existence of disease-specific gene sets in the KEGG and Metacore® databases, we compared the performance of sixteen methods under relaxed assumptions using 42 real datasets (over 1,400 samples). Most of the methods ranked highly the gene sets designed for specific diseases whenever samples from affected individuals were compared against controls via microarrays. The top methods for gene set prioritization differed from the top methods in terms of sensitivity, and four of the sixteen methods had high false positive rates, as assessed by permuting the phenotype of the samples. The best overall methods among those that generated reasonably low false positive rates when permuting phenotypes were PLAGE, GLOBALTEST, and PADOG. The best method in the category that generated higher than expected false positive rates was MRGSE.
Pages: 10