Weighted SNP Set Analysis in Genome-Wide Association Study

Times cited: 5
Authors
Dai, Hui [1]
Zhao, Yang [1]
Qian, Cheng [1]
Cai, Min [1]
Zhang, Ruyang [1]
Chu, Minjie [1]
Dai, Juncheng [1]
Hu, Zhibin [1, 2, 3]
Shen, Hongbing [1, 2, 3]
Chen, Feng [1]
Affiliations
[1] Nanjing Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Med Univ, Ctr Canc, Jiangsu Key Lab Canc Biomarkers Prevent & Treatme, Clin Epidemiol Sect, Nanjing, Jiangsu, Peoples R China
[3] Nanjing Med Univ, State Key Lab Reprod Med, Nanjing, Jiangsu, Peoples R China
Funding
Specialized Research Fund for the Doctoral Program of Higher Education; National Natural Science Foundation of China;
Keywords
MULTIPLE SNPS; SUSCEPTIBILITY; DISEASE; GENE; TESTS; LOCI;
DOI
10.1371/journal.pone.0075897
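Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract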
Genome-wide association studies (GWAS) are widely used to identify genetic variants associated with disease risk. Many approaches have been proposed to test multiple single nucleotide polymorphisms (SNPs) in a region simultaneously, addressing the disadvantages of single-locus association analysis. Kernel machine based SNP set analysis is more powerful than single-locus analysis because it borrows information from SNPs correlated with causal or tag SNPs. Four types of kernel machine functions and a principal component based approach (PCA) were also compared. However, given the loss of power caused by low minor allele frequencies (MAF), we extended PCA into a new method called weighted PCA (wPCA). We compared wPCA, the logistic kernel machine based test (LKM), and PCA on SNP sets under different MAFs and linkage disequilibrium (LD) structures. We also applied the three methods to two SNP sets extracted from a real GWAS dataset of non-small cell lung cancer in a Han Chinese population. Simulation results show that when the MAF of the causal SNP is low, wPCA and the weighted identity-by-state kernel (wIBS) are more powerful than PCA and the other kernel machine functions across different LD structures and different numbers of causal SNPs. Application of the three methods to the real GWAS dataset indicates that wPCA and wIBS perform better than the linear kernel, the IBS kernel, and PCA.
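To make the idea of MAF-weighted principal components concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the weighting scheme, so the weight 1/sqrt(2·MAF·(1−MAF)), the 80% variance threshold, the function names, and the simulated genotypes are all assumptions chosen for illustration. The sketch up-weights low-MAF SNPs before extracting components; the resulting scores would then typically enter a regression on case-control status, a step omitted here.

```python
import numpy as np


def minor_allele_freq(genotypes):
    """Per-SNP minor allele frequency from 0/1/2 allele counts."""
    freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(freq, 1.0 - freq)


def weighted_pcs(genotypes, var_explained=0.8):
    """Return PC scores of MAF-weighted genotypes and the weights used.

    Assumption: weights are 1 / sqrt(2 * maf * (1 - maf)); the source
    abstract does not specify its weights.
    """
    g = np.asarray(genotypes, dtype=float)
    maf = minor_allele_freq(g)
    keep = maf > 0  # drop monomorphic SNPs, whose weight is undefined
    g, maf = g[:, keep], maf[keep]
    weights = 1.0 / np.sqrt(2.0 * maf * (1.0 - maf))
    x = (g - g.mean(axis=0)) * weights  # center, then up-weight rare SNPs
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of components reaching the variance threshold.
    k = min(int(np.searchsorted(ratio, var_explained)) + 1, len(s))
    return u[:, :k] * s[:k], weights


# Hypothetical SNP set: 200 subjects, 4 SNPs with assumed allele frequencies.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, [0.05, 0.2, 0.4, 0.3], size=(200, 4))
scores, weights = weighted_pcs(genotypes)
```

Under this assumed scheme the rarest SNP receives the largest weight, so its variation contributes more to the leading components than it would in unweighted PCA.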
Pages: 7