A two-stage design for multiple testing in large-scale association studies

Cited by: 9
Authors
Wen, Shu-Hui
Tzeng, Jung-Ying
Kao, Jau-Tsuen
Hsiao, Chuhsing Kate [1]
Affiliations
[1] Natl Taiwan Univ, Inst Epidemiol, Div Biostat, Taipei 100, Taiwan
[2] Natl Taiwan Univ, Coll Med, Dept Clin Lab Sci & Med Biotechnol, Taipei 100, Taiwan
[3] N Carolina State Univ, Dept Stat, Raleigh, NC 27606 USA
[4] N Carolina State Univ, Bioinformat Res Ctr, Raleigh, NC 27606 USA
[5] Tzu Chi Univ, Coll Med, Dept Publ Hlth, Hualien 97004, Taiwan
Keywords
association studies; cost-effectiveness; false positive rate; multiple testing; two-stage design
DOI
10.1007/s10038-006-0393-6
CLC number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Modern association studies often involve a large number of markers and hence may encounter the problem of testing multiple hypotheses. Traditional procedures are usually over-conservative and have low power to detect mild genetic effects. From the design perspective, we propose a two-stage selection procedure to address this concern. Our main principle is to reduce the total number of tests by removing clearly unassociated markers in the first-stage test. Next, conditional on the findings of the first stage, which uses a less stringent nominal level, a more conservative test is conducted in the second stage using the augmented data and the data from the first stage. Previous studies have suggested using independent samples to avoid inflated errors. However, we found that, after accounting for the dependence between these two samples, the true discovery rate increases substantially. In addition, the cost of genotyping can be greatly reduced via this approach. Results from a study of hypertriglyceridemia and simulations suggest the two-stage method has a higher overall true positive rate (TPR) with a controlled overall false positive rate (FPR) when compared with single-stage approaches. We also report the analytical form of its overall FPR, which may be useful in guiding study design to achieve a high TPR while retaining the desired FPR.
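The two-stage idea in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not specify the test statistic, so the code assumes per-marker z-statistics, pools the first-stage and augmented samples with square-root sample-size weights, and uses hypothetical names and defaults (`two_stage_select`, `simulate`, `alpha1`, `alpha2`, `effect`). It does not reproduce the paper's analytical overall FPR.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))

def two_stage_select(z1, z2, n1, n2, alpha1, alpha2):
    """Return indices of markers that pass both stages.

    z1, z2: per-marker z-statistics from the first-stage and augmented samples
    (assumed inputs). Stage 1 screens at the lenient level alpha1; stage 2
    tests the pooled statistic (both samples together) at the stricter alpha2.
    """
    w1 = (n1 / (n1 + n2)) ** 0.5
    w2 = (n2 / (n1 + n2)) ** 0.5
    selected = []
    for j, (a, b) in enumerate(zip(z1, z2)):
        if two_sided_p(a) > alpha1:
            continue  # dropped in stage 1, so not genotyped in stage 2
        pooled = w1 * a + w2 * b  # reuses stage-1 data, hence dependence
        if two_sided_p(pooled) <= alpha2:
            selected.append(j)
    return selected

def simulate(m=2000, m_true=20, effect=0.15, n1=300, n2=300,
             alpha1=0.10, alpha2=0.001, seed=1):
    """Simulate m markers, the first m_true of which carry a true effect."""
    rng = random.Random(seed)
    mu = [effect if j < m_true else 0.0 for j in range(m)]
    z1 = [rng.gauss(mu_j * n1 ** 0.5, 1.0) for mu_j in mu]
    z2 = [rng.gauss(mu_j * n2 ** 0.5, 1.0) for mu_j in mu]
    hits = two_stage_select(z1, z2, n1, n2, alpha1, alpha2)
    tp = sum(1 for j in hits if j < m_true)
    fp = len(hits) - tp
    return {"TPR": tp / m_true, "FPR": fp / (m - m_true), "hits": hits}
```

Calling `simulate()` with different `alpha1` and `alpha2` values gives a rough feel for the trade-off the abstract describes: a lenient first stage keeps weak signals alive while cutting the number of markers carried into the second stage.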
Pages: 523-532
Page count: 10