Genome-wide association analysis by lasso penalized logistic regression

Cited by: 531
Authors
Wu, Tong Tong [5]
Chen, Yi Fang [4]
Hastie, Trevor [3, 4]
Sobel, Eric [1]
Lange, Kenneth [1, 2]
Affiliations
[1] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Biomath, Los Angeles, CA 90095 USA
[3] Stanford Univ, Dept Biostat, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[5] Univ Maryland, Dept Epidemiol & Biostat, College Pk, MD 20742 USA
Funding
US National Science Foundation;
DOI
10.1093/bioinformatics/btp041
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: In ordinary regression, imposing a lasso penalty makes continuous model selection straightforward. Lasso penalized regression is particularly advantageous when the number of predictors far exceeds the number of observations.
Method: The present article evaluates the performance of lasso penalized logistic regression in case-control disease gene mapping with a large number of single nucleotide polymorphism (SNP) predictors. The strength of the lasso penalty can be tuned to select a predetermined number of the most relevant SNPs and other predictors. For a given value of the tuning constant, the penalized likelihood is quickly maximized by cyclic coordinate ascent. Once the most potent marginal predictors are identified, their two-way and higher order interactions can also be examined by lasso penalized logistic regression.
Results: This strategy is tested on both simulated and real data. Our findings on coeliac disease replicate the previous SNP results and shed light on possible interactions among the SNPs.
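The two computational ideas in the abstract, a coordinate-wise fit for a fixed tuning constant and a search over the tuning constant until a predetermined number of predictors is selected, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the per-coordinate update, so the sketch assumes a soft-thresholded step with a fixed curvature bound (each observation contributes at most 1/4 to the second derivative), and it minimizes the negative penalized log-likelihood, which is equivalent to ascent on the penalized likelihood. The names `lasso_logistic`, `tune_lambda`, and `soft_threshold`, the bisection search, and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_logistic(X, y, lam, n_sweeps=200, tol=1e-6):
    """Cyclic coordinate updates for L1-penalized logistic regression.

    Minimizes  -loglik(b0, beta) + lam * sum_j |beta_j|  with an
    unpenalized intercept b0. Each coordinate step uses the curvature
    bound 0.25 * sum_i x_ij^2 (an assumption of this sketch), so the
    objective never increases.
    """
    n, p = X.shape
    b0, beta = 0.0, np.zeros(p)
    eta = np.zeros(n)                       # linear predictor b0 + X @ beta
    curv = 0.25 * (X ** 2).sum(axis=0)      # per-coordinate curvature bound
    for _ in range(n_sweeps):
        # Intercept: plain (unpenalized) step with curvature bound n / 4.
        prob = 0.5 * (1.0 + np.tanh(0.5 * eta))   # numerically stable sigmoid
        step = (y - prob).sum() / (0.25 * n)
        b0 += step
        eta += step
        max_change = abs(step)
        for j in range(p):
            if curv[j] == 0.0:              # constant-zero column: skip
                continue
            prob = 0.5 * (1.0 + np.tanh(0.5 * eta))
            grad = X[:, j] @ (y - prob)     # d loglik / d beta_j
            new = soft_threshold(beta[j] + grad / curv[j], lam / curv[j])
            delta = new - beta[j]
            if delta != 0.0:
                eta += delta * X[:, j]      # keep eta in sync with beta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return b0, beta


def tune_lambda(X, y, n_select, n_bisect=30):
    """Bisect on lam until exactly n_select coefficients are nonzero.

    Returns the last fit if no lam in the search hits n_select exactly.
    """
    lo = 0.0
    # At the intercept-only fit, no predictor enters once lam reaches this.
    hi = np.abs(X.T @ (y - y.mean())).max()
    for _ in range(n_bisect):
        lam = 0.5 * (lo + hi)
        b0, beta = lasso_logistic(X, y, lam)
        k = np.count_nonzero(beta)
        if k == n_select:
            break
        if k > n_select:
            lo = lam                        # too many predictors: penalize more
        else:
            hi = lam                        # too few predictors: penalize less
    return lam, b0, beta
```

With a centered genotype matrix `X` (one column per SNP) and a 0/1 case-control vector `y`, a call such as `lam, b0, beta = tune_lambda(X, y, n_select=10)` returns the tuning constant together with a fit in which the nonzero entries of `beta` mark the selected SNPs. Interaction terms among the selected SNPs could then be added as extra columns and passed through the same routine.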
Pages: 714-721
Number of pages: 8
Related papers
50 in total
  • [41] GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies
    Sikorska, Karolina
    Lesaffre, Emmanuel
    Groenen, Patrick F. J.
    Eilers, Paul H. C.
    BMC BIOINFORMATICS, 2013, 14
  • [42] GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies
    Karolina Sikorska
    Emmanuel Lesaffre
    Patrick FJ Groenen
    Paul HC Eilers
    BMC Bioinformatics, 14
  • [43] Spatiotemporal dynamics and genome-wide association analysis of desiccation tolerance in Drosophila melanogaster
    Rajpurohit, Subhash
    Gefen, Eran
    Bergland, Alan O.
    Petrov, Dmitri A.
    Gibbs, Allen G.
    Schmidt, Paul S.
    MOLECULAR ECOLOGY, 2018, 27 (17) : 3525 - 3540
  • [44] Optimal use of regression models in genome-wide association studies
    Powell, J. E.
    Kranis, A.
    Floyd, J.
    Dekkers, J. C. M.
    Knott, S.
    Haley, C. S.
    ANIMAL GENETICS, 2012, 43 (02) : 133 - 143
  • [45] Genome-wide pathway analysis of a genome-wide association study on Alzheimer's disease
    Lee, Young Ho
    Song, Gwan Gyu
    NEUROLOGICAL SCIENCES, 2015, 36 (01) : 53 - 59
  • [46] Genome-wide pathway analysis of a genome-wide association study on Alzheimer’s disease
    Young Ho Lee
    Gwan Gyu Song
    Neurological Sciences, 2015, 36 : 53 - 59
  • [47] METAINTER: meta-analysis of multiple regression models in genome-wide association studies
    Vaitsiakhovich, Tatsiana
    Drichel, Dmitriy
    Herold, Christine
    Lacour, Andre
    Becker, Tim
    BIOINFORMATICS, 2015, 31 (02) : 151 - 157
  • [48] Hybrid of Restricted and Penalized Maximum Likelihood Method for Efficient Genome-Wide Association Study
    Ren, Wenlong
    Liang, Zhikai
    He, Shu
    Xiao, Jing
    GENES, 2020, 11 (11) : 1 - 16
  • [49] An Analysis Pipeline for Genome-wide Association Studies
    Stefanov, Stefan
    Lautenberger, James
    Gold, Bert
    CANCER INFORMATICS, 2008, 6 : 455 - +
  • [50] Genome-wide association analysis on breastfeeding duration
    Colodro-Conde, Lucia
    Carland, Corinne
    Rajaei, Sheeva
    Paternoster, Lavinia
    Sanchez Romera, Juan F.
    Ordonana, Juan R.
    Lupton, Michelle
    Assimes, Themistocles L.
    Martin, Nicholas G.
    Medland, Sarah E.
    BEHAVIOR GENETICS, 2020, 50 (06) : 448 - 448