Structured variable selection with q-values

Times cited: 8
Authors
Garcia, Tanya P. [1]
Mueller, Samuel [2]
Carroll, Raymond J. [3]
Dunn, Tamara N. [4, 5]
Thomas, Anthony P. [4, 5]
Adams, Sean H. [4, 5, 6]
Pillai, Suresh D. [7, 8]
Walzem, Rosemary L. [9]
Affiliations
[1] Texas A&M Hlth Sci Ctr, Dept Epidemiol & Biostat, College Stn, TX 77843 USA
[2] Univ Sydney, Sch Math & Stat, Sydney, NSW 2006, Australia
[3] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[4] Univ Calif Davis, Grad Grp Nutr Biol, Davis, CA 95616 USA
[5] Univ Calif Davis, Dept Nutr, Davis, CA 95616 USA
[6] USDA ARS, Obes & Metab Res Unit, Western Human Nutr Res Ctr, Davis, CA 95616 USA
[7] Texas A&M Univ, Dept Poultry Sci, College Stn, TX 77843 USA
[8] Texas A&M Univ, Dept Nutr & Food Sci, College Stn, TX 77843 USA
[9] Texas A&M Univ, Grad Fac Nutr, Dept Poultry Sci, College Stn, TX 77843 USA
Funding
Australian Research Council
Keywords
False discovery rate; Microbial data; q-Values; Variable selection; Weighted Lasso; Model selection; Validation; Regression
DOI
10.1093/biostatistics/kxt012
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
When some of the regressors can act on both the response and other explanatory variables, the already challenging problem of selecting variables when the number of covariates exceeds the sample size becomes more difficult. A motivating example is a metabolic study in mice that has diet groups and gut microbial percentages that may affect changes in multiple phenotypes related to body weight regulation. The data have more variables than observations and diet is known to act directly on the phenotypes as well as on some or potentially all of the microbial percentages. Interest lies in determining which gut microflora influence the phenotypes while accounting for the direct relationship between diet and the other variables. A new methodology for variable selection in this context is presented that links the concept of q-values from multiple hypothesis testing to the recently developed weighted Lasso.
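The abstract names the two ingredients (q-values and a weighted Lasso) without giving the details of how they are combined. The sketch below is one possible reading, not the authors' procedure. It assumes that each candidate covariate already has a p-value from some marginal test, approximates q-values by Benjamini-Hochberg adjusted p-values (which matches Storey-type q-values only when the null proportion is taken as 1), and uses each q-value directly as that covariate's penalty weight, with a weight of 0 for a covariate such as diet that should always stay in the model. The function names, the weighting rule, and the toy data are all hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, used as stand-ins for q-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Coordinate descent for
        (1 / 2n) * ||y - X b||^2 + lam * sum_j weights[j] * |b_j|.
    X is a list of rows; a weight of 0 leaves that coefficient unpenalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date in place
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# Toy data (made up): column 0 plays the role of diet, columns 1-2 of microbes.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [1.9, 0.1, -0.3, -1.7]
microbe_p = [0.001, 0.6]                  # hypothetical marginal-test p-values
weights = [0.0] + bh_q_values(microbe_p)  # assumption: diet is never penalized
print(weighted_lasso(X, y, weights, lam=0.5))
```

Under this weighting, a covariate with a small q-value is penalized lightly and tends to stay in the model, while one with a large q-value is shrunk harder and is more likely to be set exactly to zero.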
Pages: 695-707
Page count: 13