Factor Analysis for Multiple Testing (FAMT): An R Package for Large-Scale Significance Testing under Dependence

Authors
Causeur, David [1]
Friguet, Chloe [1]
Houee-Bigot, Magalie [1]
Kloareg, Maela [1]
Affiliation
[1] Agrocampus Ouest, Dept Appl Math, F-35000 Rennes, France
Source
JOURNAL OF STATISTICAL SOFTWARE | 2011 / Volume 40 / Issue 14
Keywords
factor analysis; multiple testing; dependence; false discovery rate; non discovery rate; R
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The R package FAMT (factor analysis for multiple testing) provides a powerful method for large-scale significance testing under dependence. It is especially designed to select differentially expressed genes in microarray data when the correlation structure among gene expressions is strong. The method reduces the negative impact of dependence on multiple testing procedures by modeling the common information shared by all the variables using a factor analysis structure. New test statistics for general linear contrasts are deduced, taking advantage of the common factor structure to reduce correlation and consequently the variance of error rates. Thus, the FAMT method shows improvements with respect to most of the usual methods regarding the non discovery rate and the control of the false discovery rate (FDR). The steps of this procedure, each corresponding to an R function, are illustrated in this paper by two microarray data analyses. We first present how to import the gene expression data, the covariates and the gene annotations. The second step includes the choice of the optimal number of factors and the factor model fitting, and provides a list of selected genes according to a preset FDR control level. Finally, diagnostic plots are provided to help the user interpret the factors using available external information on either genes or arrays.
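To make the phrase "a list of selected genes according to a preset FDR control level" concrete, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure in plain Python. It is not the FAMT implementation: FAMT is an R package and applies this kind of selection to factor-adjusted test statistics, a step that is not reproduced here. The function name `benjamini_hochberg` and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha.

    Illustrative sketch of the Benjamini-Hochberg step-up rule:
    find the largest rank k such that p_(k) <= k * alpha / m,
    then reject the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k = rank  # keep the largest rank that satisfies the bound
    return sorted(order[:k])


# Hypothetical p-values for ten genes (made up for the example).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(pvals, alpha=0.05)
print(selected)  # indices of the genes that would be selected
```

Because the rule is step-up, a p-value that fails its own threshold can still be rejected when a larger p-value further down the ordering passes its threshold. Under strong dependence between tests, the realized false discovery proportion of such a procedure can vary widely from one dataset to another, which is the instability the abstract says the factor model is meant to reduce.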
Pages: 1-19
Number of pages: 19