Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis

Cited by: 23
Authors
Charmpi, Konstantina [1, 2, 3]
Ycart, Bernard [1, 2, 3]
Affiliations
[1] Univ Grenoble Alpes, Grenoble, France
[2] CNRS, Lab Jean Kuntzmann, UMR5224, Grenoble, France
[3] Lab Excellence TOUCAN, Toulouse, France
Keywords
empirical processes; GSEA; Monte-Carlo simulation; statistical test; weak convergence; DIFFERENTIAL EXPRESSION; IDENTIFICATION; REVEALS; MODELS;
DOI
10.1515/sagmb-2014-0077
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010 ; 081704 ;
Abstract
Gene Set Enrichment Analysis (GSEA) is a basic tool for the treatment of genomic data. Its test statistic is based on a cumulated weight function, and its distribution under the null hypothesis is evaluated by Monte-Carlo simulation. Here, it is proposed to subtract from the cumulated weight function its asymptotic expectation, then scale the result. Under the null hypothesis, the convergence in distribution of the new test statistic is proved, using the theory of empirical processes. The limiting distribution needs to be computed only once, and can then be used for many different gene sets. This results in large savings in computing time. The test defined in this way is called the Weighted Kolmogorov Smirnov (WKS) test. Using expression data from the GEO repository, tested against the MSig Database C2, a comparison between the classical GSEA test and the new procedure has been conducted. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could be more informative than the classical GSEA test in many cases.
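The centering-and-scaling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`wks_statistic`, `simulate_null`, `p_value`), the normalisation by the gene-set size k, the finite-sample expectation used for centering, and the Monte-Carlo approximation of the reference distribution are all assumptions made for illustration. The weights are assumed to be non-negative and already ordered by gene rank.

```python
import numpy as np

def wks_statistic(weights, in_set):
    """Centered, scaled cumulated-weight statistic (illustrative only).

    weights : non-negative weights of all n genes, in ranked order
    in_set  : boolean mask of length n, True for genes in the gene set
    """
    weights = np.asarray(weights, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    n, k = weights.size, int(in_set.sum())
    if k == 0:
        raise ValueError("gene set is empty")
    # Cumulated weight of the gene set, normalised by its size k.
    cumulated = np.cumsum(weights * in_set) / k
    # Expectation of `cumulated` when the set is a uniform random
    # subset of size k (assumed stand-in for the asymptotic expectation).
    expected = np.cumsum(weights) / n
    # Centre, scale by sqrt(k), and take the largest absolute deviation.
    return np.sqrt(k) * np.max(np.abs(cumulated - expected))

def simulate_null(weights, k, n_draws=1000, seed=0):
    """Monte-Carlo sample of the statistic for random gene sets of size k."""
    rng = np.random.default_rng(seed)
    n = len(weights)
    stats = np.empty(n_draws)
    for d in range(n_draws):
        mask = np.zeros(n, dtype=bool)
        mask[rng.choice(n, size=k, replace=False)] = True
        stats[d] = wks_statistic(weights, mask)
    return stats

def p_value(observed, null_stats):
    """Upper-tail empirical p-value with the usual +1 correction."""
    return (1 + np.sum(null_stats >= observed)) / (1 + len(null_stats))
```

In this sketch the reference sample returned by `simulate_null` is simulated once for a given weight vector and then reused through `p_value` for every gene set under test, which mirrors the computational saving the abstract describes. Reusing it across gene sets of different sizes relies on the convergence result stated in the abstract and is only approximate for small sets.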
Pages: 279-293
Page count: 15
Related papers
36 entries in total
[1]   Analysis of the mechanisms mediating tumor-specific changes in gene expression in human liver tumors [J].
Acevedo, Luis G. ;
Bieda, Mark ;
Green, Roland ;
Farnham, Peggy J. .
CANCER RESEARCH, 2008, 68 (08) :2641-2651
[2]  
[Anonymous], ORG HS EG BD GENOME
[3]  
[Anonymous], HGUG4110B DB AGILENT
[4]  
Arnold TB, 2011, R J, V3, P34
[5]   Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1 [J].
Barbie, David A. ;
Tamayo, Pablo ;
Boehm, Jesse S. ;
Kim, So Young ;
Moody, Susan E. ;
Dunn, Ian F. ;
Schinzel, Anna C. ;
Sandy, Peter ;
Meylan, Etienne ;
Scholl, Claudia ;
Froehling, Stefan ;
Chan, Edmond M. ;
Sos, Martin L. ;
Michel, Kathrin ;
Mermel, Craig ;
Silver, Serena J. ;
Weir, Barbara A. ;
Reiling, Jan H. ;
Sheng, Qing ;
Gupta, Piyush B. ;
Wadlow, Raymond C. ;
Le, Hanh ;
Hoersch, Sebastian ;
Wittner, Ben S. ;
Ramaswamy, Sridhar ;
Livingston, David M. ;
Sabatini, David M. ;
Meyerson, Matthew ;
Thomas, Roman K. ;
Lander, Eric S. ;
Mesirov, Jill P. ;
Root, David E. ;
Gilliland, D. Gary ;
Jacks, Tyler ;
Hahn, William C. .
NATURE, 2009, 462 (7269) :108-U122
[6]   The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity [J].
Barretina, Jordi ;
Caponigro, Giordano ;
Stransky, Nicolas ;
Venkatesan, Kavitha ;
Margolin, Adam A. ;
Kim, Sungjoon ;
Wilson, Christopher J. ;
Lehar, Joseph ;
Kryukov, Gregory V. ;
Sonkin, Dmitriy ;
Reddy, Anupama ;
Liu, Manway ;
Murray, Lauren ;
Berger, Michael F. ;
Monahan, John E. ;
Morais, Paula ;
Meltzer, Jodi ;
Korejwa, Adam ;
Jane-Valbuena, Judit ;
Mapa, Felipa A. ;
Thibault, Joseph ;
Bric-Furlong, Eva ;
Raman, Pichai ;
Shipway, Aaron ;
Engels, Ingo H. ;
Cheng, Jill ;
Yu, Guoying K. ;
Yu, Jianjun ;
Aspesi, Peter, Jr. ;
de Silva, Melanie ;
Jagtap, Kalpana ;
Jones, Michael D. ;
Wang, Li ;
Hatton, Charles ;
Palescandolo, Emanuele ;
Gupta, Supriya ;
Mahan, Scott ;
Sougnez, Carrie ;
Onofrio, Robert C. ;
Liefeld, Ted ;
MacConaill, Laura ;
Winckler, Wendy ;
Reich, Michael ;
Li, Nanxin ;
Mesirov, Jill P. ;
Gabriel, Stacey B. ;
Getz, Gad ;
Ardlie, Kristin ;
Chan, Vivien ;
Myer, Vic E. .
NATURE, 2012, 483 (7391) :603-607
[7]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[8]   Application of a priori established gene sets to discover biologically important differential expression in microarray data [J].
Bild, A ;
Febbo, PG .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (43) :15278-15279
[9]  
Dudoit S, 2008, SPRINGER SER STAT, P1
[10]   Gene Expression Omnibus: NCBI gene expression and hybridization array data repository [J].
Edgar, R ;
Domrachev, M ;
Lash, AE .
NUCLEIC ACIDS RESEARCH, 2002, 30 (01) :207-210