Robust rank aggregation for gene list integration and meta-analysis

Cited by: 812
Authors
Kolde, Raivo [1, 2]
Laur, Sven [1]
Adler, Priit [1, 3]
Vilo, Jaak [1, 2]
Affiliations
[1] Univ Tartu, Inst Comp Sci, EE-50409 Tartu, Estonia
[2] Quretec, EE-51003 Tartu, Estonia
[3] Univ Tartu, Inst Mol & Cell Biol, EE-51010 Tartu, Estonia
Keywords
SACCHAROMYCES-CEREVISIAE; MICROARRAY EXPERIMENTS; EXPRESSION PROFILES; GENOMIC DATA; CANCER; COEXPRESSION; NETWORK; VALIDATION; REVEALS; ARCHIVE;
DOI
10.1093/bioinformatics/btr709
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Continued progress in technological platforms, the availability of many published experimental datasets, and the variety of statistical methods for analyzing those data have made it possible to approach the same research question with several methods simultaneously. To get the best out of all these alternatives, we need to integrate their results in an unbiased manner. Prioritized gene lists are a common way of presenting results in genomic data analysis applications. Thus, rank aggregation methods can become a useful and general solution for the integration task.
Results: Standard rank aggregation methods are often ill-suited for biological settings where the gene lists are inherently noisy. As a remedy, we propose a novel robust rank aggregation (RRA) method. Our method detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs and assigns a significance score to each gene. The underlying probabilistic model makes the algorithm parameter-free and robust to outliers, noise and errors. Significance scores also provide a rigorous way to keep only the statistically relevant genes in the final list. These properties make our approach robust and compelling for many settings.
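The idea of scoring a gene by how much better it ranks than expected under uncorrelated inputs can be illustrated with a minimal Python sketch. The abstract does not spell out the computation, so the details here are assumptions made for illustration: each rank is normalized to (0, 1], the null model treats the normalized ranks as independent uniform values, the score is the smallest order-statistic p-value, and a Bonferroni-style factor of n is applied. The function names `order_statistic_pvalue`, `rho_score` and `aggregate` are hypothetical and do not refer to any published implementation.

```python
from math import comb


def order_statistic_pvalue(x, k, n):
    """P(the k-th smallest of n independent Uniform(0,1) values is <= x).

    Equals the probability that at least k of the n values fall below x.
    """
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Smallest order-statistic p-value, multiplied by n and capped at 1.

    Assumption: the factor n is a Bonferroni-style correction for taking
    the minimum over n order statistics.
    """
    r = sorted(normalized_ranks)
    n = len(r)
    best = min(order_statistic_pvalue(x, k, n) for k, x in enumerate(r, start=1))
    return min(1.0, n * best)


def aggregate(ranked_lists):
    """Score every gene and return (gene, score) pairs, best first.

    Assumption: every input list ranks the same set of genes (no partial lists).
    """
    n_genes = len(ranked_lists[0])
    scores = {}
    for gene in ranked_lists[0]:
        ranks = [(lst.index(gene) + 1) / n_genes for lst in ranked_lists]
        scores[gene] = rho_score(ranks)
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy input: gene "a" is ranked first in all three lists.
lists = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "d", "c", "b"]]
result = aggregate(lists)
```

In this toy input only gene "a" is ranked consistently near the top, so it receives the smallest score, while genes whose positions vary across lists end up with scores capped at 1. Taking the minimum over order statistics means a gene is not heavily penalized for a few poor ranks as long as it ranks well in a sufficient subset of lists, which is one way to obtain the robustness to outliers that the abstract describes.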
Pages: 573-580
Page count: 8