Optimal screening for promising genes in 2-stage designs

Cited by: 5
Authors
Moerkerke, B. [1]
Goetghebeur, E. [1, 2]
Affiliations
[1] Univ Ghent, Dept Appl Math & Comp Sci, B-9000 Ghent, Belgium
[2] Harvard Univ, Sch Med, Dept Biostat, Boston, MA 02115 USA
Funding
National Institutes of Health (USA);
Keywords
alternative p-value; balanced test; cost-efficient screening; false discovery rate; gene selection; multiple testing; optimal designs; two-stage designs;
DOI
10.1093/biostatistics/kxn002
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Detecting genetic markers with biologically relevant effects remains a challenge due to multiple testing. Standard analysis methods focus on evidence against the null and protect primarily the type I error. On the other hand, the worthwhile alternative is specified for power calculations at the design stage. The balanced test as proposed by Moerkerke and others (2006) and Moerkerke and Goetghebeur (2006) incorporates this alternative directly in the decision criterion to achieve better power. Genetic markers are selected and ranked in order of the balance of evidence they contain against the null and the target alternative. In this paper, we build on this guiding principle to develop 2-stage designs for screening genetic markers when the cost of measurements is high. For a given marker, a first sample may already provide sufficient evidence for or against the alternative. If not, more data are gathered at the second stage which is then followed by a binary decision based on all available data. By optimizing parameters which determine the decision process over the 2 stages (such as the area of the "gray" zone which leads to the gathering of extra data), the expected cost per marker can be reduced substantially. We also demonstrate that, compared to 1-stage designs, 2-stage designs achieve a better balance between true negatives and positives for the same cost.
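The 2-stage decision process described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' balanced test: it assumes a plain one-sided z-statistic with known standard deviation, and the names `stage_one_action`, `gray_zone_probability`, `expected_cost`, the cutoffs `lower` and `upper`, and the sample sizes `n1` and `n2` are all hypothetical, introduced only for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def stage_one_action(z1: float, lower: float, upper: float) -> str:
    """Hypothetical first-stage rule: decide early or enter the gray zone."""
    if z1 >= upper:
        return "select"
    if z1 <= lower:
        return "discard"
    return "continue"  # gray zone: gather second-stage data


def gray_zone_probability(effect, sigma, n1, lower, upper) -> float:
    """P(lower < Z1 < upper), assuming Z1 ~ N(sqrt(n1) * effect / sigma, 1)."""
    shift = math.sqrt(n1) * effect / sigma
    return normal_cdf(upper - shift) - normal_cdf(lower - shift)


def expected_cost(effect, sigma, n1, n2, lower, upper, unit_cost=1.0) -> float:
    """Expected cost per marker: stage 1 is always paid, stage 2 only in the gray zone."""
    p_gray = gray_zone_probability(effect, sigma, n1, lower, upper)
    return unit_cost * (n1 + n2 * p_gray)
```

Under these assumptions, widening the gray zone raises the probability of a second stage and therefore the expected cost, while narrowing it lowers the cost at the price of more early decisions made on first-stage data alone. Optimizing the design amounts to trading these off.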
Pages: 700-714
Number of pages: 15
Related papers
24 in total
[1]   A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes [J].
Baldi, P ;
Long, AD .
BIOINFORMATICS, 2001, 17 (06) :509-519
[2]   Key concepts in genetic epidemiology [J].
Burton, PR ;
Tobin, MD ;
Hopper, JL .
LANCET, 2005, 366 (9489) :941-951
[3]   Balancing false positives and false negatives for the detection of differential expression in malignancies [J].
De Smet, F ;
Moreau, Y ;
Engelen, K ;
Timmerman, D ;
Vergote, I ;
De Moor, B .
BRITISH JOURNAL OF CANCER, 2004, 91 (06) :1160-1165
[4]   Multiple-testing strategy for analyzing cDNA array data on gene expression [J].
Delongchamp, RR ;
Bowyer, JF ;
Chen, JJ ;
Kodell, RL .
BIOMETRICS, 2004, 60 (03) :774-782
[5]  
HUBER W, 2003, HDB STAT GENETICS, P162
[6]   The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster [J].
Jin, W ;
Riley, RM ;
Wolfinger, RD ;
White, KP ;
Passador-Gurgel, G ;
Gibson, G .
NATURE GENETICS, 2001, 29 (04) :389-395
[7]   Power and sample size for DNA microarray studies [J].
Lee, MLT ;
Whitmore, GA .
STATISTICS IN MEDICINE, 2002, 21 (23) :3543-3570
[8]  
Lönnstedt I, 2002, STAT SINICA, V12, P31
[9]   Significance and impotence: towards a balanced view of the null and the alternative hypotheses in marker selection for plant breeding [J].
Moerkerke, B ;
Goetghebeur, E ;
De Riek, J ;
Roldán-Ruiz, I .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2006, 169 :61-79
[10]   Selecting "significant" differentially expressed genes from the combined perspective of the null and the alternative [J].
Moerkerke, B ;
Goetghebeur, E .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2006, 13 (09) :1513-1531