Weighted SNP Set Analysis in Genome-Wide Association Study

Cited by: 5
Authors
Dai, Hui [1]
Zhao, Yang [1]
Qian, Cheng [1]
Cai, Min [1]
Zhang, Ruyang [1]
Chu, Minjie [1]
Dai, Juncheng [1]
Hu, Zhibin [1, 2, 3]
Shen, Hongbing [1, 2, 3]
Chen, Feng [1]
Affiliations
[1] Nanjing Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Med Univ, Ctr Canc, Jiangsu Key Lab Canc Biomarkers Prevent & Treatme, Clin Epidemiol Sect, Nanjing, Jiangsu, Peoples R China
[3] Nanjing Med Univ, State Key Lab Reprod Med, Nanjing, Jiangsu, Peoples R China
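Source
PLOS ONE | 2013 / Vol. 8 / Issue 09
Funding
National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education;
Keywords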
MULTIPLE SNPS; SUSCEPTIBILITY; DISEASE; GENE; TESTS; LOCI;
DOI
10.1371/journal.pone.0075897
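Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract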
Genome-wide association studies (GWAS) are widely used to identify genetic variants associated with disease risk. Many approaches have been proposed to test multiple single nucleotide polymorphisms (SNPs) in a region simultaneously, addressing the disadvantages of single-locus association analysis. Kernel machine based SNP set analysis, which borrows information from SNPs correlated with causal or tag SNPs, is more powerful than single-locus analysis. Four types of kernel machine functions and a principal component based approach (PCA) were also compared. However, given the loss of power caused by low minor allele frequencies (MAF), we extended PCA into a new method called weighted PCA (wPCA). We performed a comparative analysis of weighted principal component analysis (wPCA), the logistic kernel machine based test (LKM) and principal component analysis (PCA) on SNP sets under different MAFs and linkage disequilibrium (LD) structures. We also applied the three methods to two SNP sets extracted from a real GWAS dataset of non-small cell lung cancer in a Han Chinese population. Simulation results show that when the MAF of the causal SNP is low, weighted principal component and weighted IBS (wIBS) are more powerful than PCA and the other kernel machine functions across different LD structures and different numbers of causal SNPs. Application of the three methods to the real GWAS dataset indicates that wPCA and wIBS perform better than the linear kernel, the IBS kernel and PCA.
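The abstract names a weighted IBS (identity-by-state) kernel but does not give its formula. As a rough illustration only, the sketch below up-weights rare SNPs when measuring allele sharing between two individuals. It assumes additive 0/1/2 genotype coding and a weight of 1/sqrt(MAF) per SNP; that weighting scheme, the function names `minor_allele_freqs` and `weighted_ibs_kernel`, and the toy genotype matrix are all assumptions for illustration, not details taken from the paper.

```python
from math import sqrt


def minor_allele_freqs(genotypes):
    """Estimate the minor allele frequency of each SNP.

    genotypes: list of rows, one per individual, with allele counts 0/1/2.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    freqs = []
    for p in range(n_snps):
        f = sum(row[p] for row in genotypes) / (2.0 * n)
        freqs.append(min(f, 1.0 - f))  # fold to the minor allele
    return freqs


def weighted_ibs_kernel(genotypes, weights):
    """Weighted IBS similarity between every pair of individuals.

    For additive coding, the number of alleles shared IBS at one SNP
    is 2 - |a - b|. Each SNP's contribution is scaled by its weight and
    the total is normalised so that identical genotypes score 1.
    """
    total = 2.0 * sum(weights)
    n = len(genotypes)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(
                w * (2 - abs(a - b))
                for w, a, b in zip(weights, genotypes[i], genotypes[j])
            )
            K[i][j] = K[j][i] = shared / total
    return K


# Hypothetical toy data: 4 individuals, 3 SNPs (not from the paper).
genotypes = [
    [0, 1, 2],
    [0, 1, 1],
    [1, 0, 2],
    [0, 2, 1],
]
maf = minor_allele_freqs(genotypes)
# Assumed weighting: rarer SNPs receive larger weights.
weights = [1.0 / sqrt(q) for q in maf]
K = weighted_ibs_kernel(genotypes, weights)
```

With this weighting, a mismatch at a low-MAF SNP lowers the similarity between two individuals more than a mismatch at a common SNP, which is one plausible way a weighted kernel could recover power when the causal SNP is rare. Monomorphic SNPs (MAF of 0) would need to be filtered out before computing the weights.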
Pages: 7