Variable selection in semi-parametric models

Cited by: 3
Authors
Zhang, Hongmei [1]
Maity, Arnab [2]
Arshad, Hasan [3, 4]
Holloway, John [5]
Karmaus, Wilfried [1]
Affiliations
[1] Univ Memphis, Sch Publ Hlth, Div Epidemiol Biostat & Environm Hlth, Memphis, TN 38152 USA
[2] North Carolina State Univ, Dept Stat, Raleigh, NC USA
[3] St Marys Hosp, David Hide Asthma & Allergy Res Ctr, Isle Of Wight, England
[4] Univ Southampton, Allergy & Clin Immunol, Southampton, Hants, England
[5] Univ Southampton, Fac Med, Southampton, Hants, England
Keywords
Bayesian methods; Gaussian kernel; non-linear effects; partially linear regression; probit regression; reproducing kernel; variable selection; ENVIRONMENTAL TOBACCO-SMOKE; MATERNAL SMOKING; ORACLE PROPERTIES; DNA METHYLATION; REGRESSION; PREGNANCY; CHILDREN; ASTHMA; ASSOCIATION; STRATEGIES;
DOI
10.1177/0962280213499679
Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Subject classification number
Abstract
We propose Bayesian variable selection methods for semi-parametric models in the framework of partially linear Gaussian and probit regressions. Reproducing kernels are used to evaluate the possibly non-linear joint effect of a set of variables. Indicator variables are introduced into the reproducing kernels to include or exclude each variable. Different scenarios based on the posterior probabilities of including a variable are proposed to select important variables. Simulations are used to demonstrate and evaluate the methods. The simulations show that the proposed methods can efficiently select the correct variables whether the effects are linear or non-linear in an unknown form. The proposed methods are applied to two real data sets to identify cytosine-phosphate-guanine (CpG) methylation sites associated with maternal smoking and CpG sites associated with cotinine levels after adjusting for creatinine levels. The selected methylation sites have the potential to advance our understanding of the mechanism underlying the impact of smoking exposure on health outcomes, and consequently to benefit medical research on disease intervention.
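The idea of placing inclusion indicators inside a reproducing kernel can be illustrated with a minimal sketch. The abstract does not state the kernel's exact form, so the Gaussian kernel below, K(x, y) = exp(-sum_j delta_j (x_j - y_j)^2 / rho), is an assumption chosen for illustration, and the names indicator_gaussian_kernel, kernel_matrix, delta, and rho are hypothetical rather than taken from the paper. A variable with delta_j = 0 drops out of the distance and therefore cannot influence the kernel.

```python
import math
from typing import List, Sequence


def indicator_gaussian_kernel(
    x: Sequence[float],
    y: Sequence[float],
    delta: Sequence[int],
    rho: float = 1.0,
) -> float:
    """Gaussian kernel in which variable j contributes only if delta[j] == 1.

    Assumed form (not taken from the paper):
        K(x, y) = exp(-sum_j delta_j * (x_j - y_j)**2 / rho)
    """
    if not (len(x) == len(y) == len(delta)):
        raise ValueError("x, y and delta must have the same length")
    if rho <= 0:
        raise ValueError("rho must be positive")
    # Squared distance restricted to the variables that are switched on.
    sq_dist = sum(d * (a - b) ** 2 for a, b, d in zip(x, y, delta))
    return math.exp(-sq_dist / rho)


def kernel_matrix(
    rows: Sequence[Sequence[float]],
    delta: Sequence[int],
    rho: float = 1.0,
) -> List[List[float]]:
    """Kernel (Gram) matrix over all pairs of observations in rows."""
    return [
        [indicator_gaussian_kernel(r, s, delta, rho) for s in rows]
        for r in rows
    ]


# Two observations on two variables; only the first variable is included.
data = [[0.0, 5.0], [1.0, -3.0]]
K_first_only = kernel_matrix(data, delta=[1, 0])
# With every variable excluded, all observations look identical to the kernel.
K_none = kernel_matrix(data, delta=[0, 0])
```

In a Bayesian sampler, the delta vector would be updated at each iteration and the posterior frequency of delta_j = 1 would serve as the inclusion probability for variable j; that sampling step is not shown here.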
Pages: 1736-1752
Number of pages: 17
Related papers
50 in total
  • [1] Unified variable selection in semi-parametric models
    Terry, William
    Zhang, Hongmei
    Maity, Arnab
    Arshad, Hasan
    Karmaus, Wilfried
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2017, 26 (06) : 2821 - 2831
  • [2] Variable selection in finite mixture of semi-parametric regression models
    Ormoz, Ehsan
    Eskandari, Farzad
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2016, 45 (03) : 695 - 711
  • [3] Variable selection and estimation for semi-parametric multiple-index models
    Wang, Tao
    Xu, Peirong
    Zhu, Lixing
    BERNOULLI, 2015, 21 (01) : 242 - 275
  • [4] Modern variable selection for longitudinal semi-parametric models with missing data
    Kowalski, J.
    Hao, S.
    Chen, T.
    Liang, Y.
    Liu, J.
    Ge, L.
    Feng, C.
    Tu, X. M.
    JOURNAL OF APPLIED STATISTICS, 2018, 45 (14) : 2548 - 2562
  • [5] A semi-parametric estimator for censored selection models with endogeneity
    Lee, MJ
    Vella, F
    JOURNAL OF ECONOMETRICS, 2006, 130 (02) : 235 - 252
  • [6] Instrumental variable estimation in semi-parametric additive hazards models
    Brueckner, Matthias
    Titman, Andrew
    Jaki, Thomas
    BIOMETRICS, 2019, 75 (01) : 110 - 120
  • [7] Semi-parametric copula sample selection models for count responses
    Marra, Giampiero
    Wyszynski, Karol
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2016, 104 : 110 - 129
  • [8] Selection of covariance patterns for longitudinal data in semi-parametric models
    Li, Jialiang
    Wong, Weng Kee
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2010, 19 (02) : 183 - 196
  • [9] Robust signed-rank estimation and variable selection for semi-parametric additive partial linear models
    Nguelifack, Brice M.
    Kemajou-Brown, Isabelle
    JOURNAL OF APPLIED STATISTICS, 2020, 47 (10) : 1794 - 1819
  • [10] Semi-parametric regression models and economies of scale in the presence of an endogenous variable
    Cohen, Jeffrey P.
    Osleeb, Jeffrey P.
    Yang, Ke
    REGIONAL SCIENCE AND URBAN ECONOMICS, 2014, 49 : 252 - 261