Analysis of multiple SNPs in genetic association studies: Comparison of three multi-locus methods to prioritize and select SNPs

Cited by: 34
Authors
Heidema, A. Geert [1, 2]
Feskens, Edith J. M. [3]
Doevendans, Pieter A. F. M. [4]
Ruven, Henk J. T. [5]
Van Houwelingen, Hans C. [1, 2, 6]
Mariman, Edwin C. M.
Boer, Jolanda M. A. [1, 2]
Affiliations
[1] Maastricht Univ, Dept Human Biol, NL-6200 MD Maastricht, Netherlands
[2] Natl Inst Publ Hlth & Environm, Ctr Nutr & Hlth, NL-3720 BA Bilthoven, Netherlands
[3] Univ Wageningen & Res Ctr, Div Human Nutr, Wageningen, Netherlands
[4] Univ Utrecht, Med Ctr, Heart Lung Ctr Utrecht, Utrecht, Netherlands
[5] St Antonius Hosp, Dept Clin Chem, Nieuwegein, Netherlands
[6] Leiden Univ, Med Ctr, Dept Med Stat, Leiden, Netherlands
Keywords
multi-locus methods; set association; random forests; MDR
DOI
10.1002/gepi.20251
Chinese Library Classification number
Q3 [Genetics]
Subject classification code
071007; 090102
Abstract
Nonparametric approaches have been developed that are able to analyze large numbers of single nucleotide polymorphisms (SNPs) in modest sample sizes. These approaches have different selection features and may not provide similar results when applied to the same dataset. Therefore, we compared the results of three approaches (set association, random forests and multifactor dimensionality reduction [MDR]) to select, from a total of 93 candidate SNPs, a subset of SNPs that are important in determining high-density lipoprotein (HDL)-cholesterol levels. The study population consisted of a random sample from a Dutch monitoring project for cardiovascular disease risk factors and was dichotomized into cases (low HDL-cholesterol, n = 533) and non-cases (high HDL-cholesterol, n = 545) based on gender-specific median values for HDL-cholesterol. All three approaches clearly prioritized three SNPs as important (CETP Taq1B, CETP -629 C/A and LPL Ser447X). Two SNPs with weaker main effects were additionally prioritized by random forests (APOC3 3175 G/C and CCR2 Val62Ile), whereas MTHFR 677 C/T was selected in combination with CETP Taq1B as the best model by MDR. The p-values obtained for the selected models were significant for the set association approach (p = 0.0019), random forests (p < 0.01) and MDR (p < 0.02). In conclusion, the application of a combination of multi-locus methods is a useful approach in genetic association studies to select a well-defined set of important SNPs for further statistical and epidemiological interpretation, providing increased confidence and more information compared with the application of only one method.
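The dichotomization step described in the abstract (cases versus non-cases split at gender-specific medians) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the `(sex, hdl)` record layout, the toy values, and the tie rule (a value equal to the median counts as a non-case) are all assumptions made for illustration.

```python
from statistics import median


def dichotomize_by_sex_median(records):
    """Flag each record as a case when its HDL is below the median of its own sex.

    `records` is a list of (sex, hdl) tuples. The layout and the tie rule
    (values equal to the median are non-cases) are assumptions, not details
    taken from the study.
    """
    # Group HDL values by sex so that each sex gets its own median.
    by_sex = {}
    for sex, hdl in records:
        by_sex.setdefault(sex, []).append(hdl)
    medians = {sex: median(values) for sex, values in by_sex.items()}
    # Case = HDL strictly below the sex-specific median.
    return [(sex, hdl, hdl < medians[sex]) for sex, hdl in records]


# Hypothetical toy data (mmol/L), not taken from the study.
sample = [("F", 1.2), ("F", 1.6), ("F", 1.9), ("F", 1.4),
          ("M", 0.9), ("M", 1.3), ("M", 1.1), ("M", 1.5)]
labelled = dichotomize_by_sex_median(sample)
```

Splitting at a per-sex median rather than a pooled one keeps the case and non-case groups roughly balanced within each sex, which is consistent with the near-equal group sizes reported in the abstract (533 versus 545).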
Pages: 910-921
Page count: 12