Positive false discovery rate estimate in step-wise variable selection

Cited by: 2
Authors
Li, Lang [1]
Hui, Siu [1]
Affiliations
[1] Indiana Univ, Dept Med, Div Biostat, Indianapolis, IN USA
Keywords
cross-validation; false discovery rate; multiple comparisons; pharmacogenetics; variable selection
DOI
10.1080/03610910701569614
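Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract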
Selecting predictors to optimize outcome prediction is an important statistical task, but it usually ignores the false positives among the selected predictors. In this article, we advocate a conventional forward stepwise variable selection method based on the predicted residual sum of squares, and we develop a positive false discovery rate (pFDR) estimate for the selected predictor subset, together with a local pFDR estimate to prioritize the selected predictors. This pFDR estimate takes into account the existence of non-null predictors and is proved to be asymptotically conservative. In addition, we propose two views of a variable selection process: an overall test and an individual test. An interesting feature of the overall test is that its power to select non-null predictors increases with the proportion of non-null predictors among all candidate predictors. The data analysis is illustrated with an example in which genetic and clinical predictors were selected to predict the change in cholesterol level after four months of tamoxifen treatment, and the pFDR was estimated. The method's performance is evaluated through statistical simulations.
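The selection step named in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept, the standard leave-one-out identity PRESS = sum((e_i / (1 - h_ii))^2), and a stopping rule of "stop when PRESS no longer decreases". The function names, the stopping rule, and the simulated data are assumptions made for illustration and are not taken from the article; the pFDR and local pFDR estimates are not implemented here.

```python
import numpy as np

def press(design, y):
    """Leave-one-out predicted residual sum of squares for OLS of y on design."""
    q, _ = np.linalg.qr(design)            # thin QR of the design matrix
    resid = y - q @ (q.T @ y)              # ordinary least-squares residuals
    leverage = np.sum(q ** 2, axis=1)      # diagonal of the hat matrix
    return float(np.sum((resid / (1.0 - leverage)) ** 2))

def forward_select_press(X, y):
    """Greedy forward selection: add the predictor that lowers PRESS most.

    Stops when no remaining candidate reduces PRESS (an assumed rule).
    Assumes far fewer predictors than observations, so leverages stay below 1.
    """
    n, p = X.shape
    design = np.ones((n, 1))               # start from the intercept-only model
    selected = []
    best = press(design, y)
    while len(selected) < p:
        scores = {
            j: press(np.column_stack([design, X[:, j]]), y)
            for j in range(p) if j not in selected
        }
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:         # no improvement: stop
            break
        selected.append(j_best)
        design = np.column_stack([design, X[:, j_best]])
        best = scores[j_best]
    return selected, best

# Simulated data (hypothetical): only columns 2 and 7 are non-null predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + rng.normal(scale=0.5, size=200)
selected, best_press = forward_select_press(X, y)
print(selected, best_press)
```

Any index in `selected` other than 2 and 7 is a false positive in this simulation, which is the kind of error a pFDR estimate for the selected subset is meant to quantify.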
Pages: 1217-1231
Number of pages: 15
Related papers
23 in total
[1]   Adaptive thresholding of wavelet coefficients [J].
Abramovich, F ;
Benjamini, Y .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 1996, 22 (04) :351-361
[2]   RELATIONSHIP BETWEEN VARIABLE SELECTION AND DATA AUGMENTATION AND A METHOD FOR PREDICTION [J].
ALLEN, DM .
TECHNOMETRICS, 1974, 16 (01) :125-127
[3]  
[Anonymous], 1993, Resampling-based multiple testing: Examples and methods for P-value adjustment
[4]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[5]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[6]   A model selection approach for the identification of quantitative trait loci in experimental crosses [J].
Broman, KW ;
Speed, TP .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2002, 64 :641-656
[7]   A RISK-BENEFIT ASSESSMENT OF TAMOXIFEN THERAPY [J].
CATHERINO, WH ;
JORDAN, VC .
DRUG SAFETY, 1993, 8 (05) :381-397
[8]  
Chang J, 1996, ANN ONCOL, V7, P671
[9]   Metabolism of tamoxifen by recombinant human cytochrome P450 enzymes: Formation of the 4-hydroxy, 4′-hydroxy and N-desmethyl metabolites and isomerization of trans-4-hydroxytamoxifen [J].
Crewe, HK ;
Notley, LM ;
Wunsch, RM ;
Lennard, MS ;
Gillam, EMJ .
DRUG METABOLISM AND DISPOSITION, 2002, 30 (08) :869-874
[10]   Empirical Bayes analysis of a microarray experiment [J].
Efron, B ;
Tibshirani, R ;
Storey, JD ;
Tusher, V .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1151-1160