Semiparametric model averaging prediction for dichotomous response

Cited by: 29
Authors
Fang, Fang [1 ]
Li, Jialiang [2 ]
Xia, Xiaochao [3 ]
Affiliations
[1] East China Normal Univ, Fac Econ & Management, Key Lab Adv Theory & Applicat Stat & Data Sci, MOE, Shanghai 200241, Peoples R China
[2] Natl Univ Singapore, Duke NUS Grad Med Sch, Dept Stat & Appl Probabil, Singapore 117546, Singapore
[3] Chongqing Univ, Coll Math & Stat, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Kullback-Leibler loss; Mis-specification; Model averaging; Semiparametric model; Spline basis; Regression; Selection; Likelihood; Inference
DOI
10.1016/j.jeconom.2020.09.008
Chinese Library Classification: F [Economics]
Discipline Code: 02
Abstract
Model averaging has attracted considerable attention over the past decades as an effective forecasting device in econometrics, the social sciences and medicine. So far, most model averaging methods have focused either on parametric models or on nonparametric models with a continuous response. In this paper, we propose a semiparametric model averaging prediction (SMAP) method for a dichotomous response. The idea is to approximate the unknown score function by a linear combination of one-dimensional marginal score functions. The weight parameters involved in the approximation are obtained by first smoothing the nonparametric marginal scores and then applying parametric model averaging via maximum likelihood estimation. The proposed SMAP provides greater flexibility than parametric models while being more stable than a fully nonparametric approach. Theoretical properties are investigated in two practical scenarios: (i) the covariates are conditionally independent given the response; and (ii) the conditional independence assumption does not hold. In the first scenario, we show that SMAP puts weight one on the true model, and hence the model averaging estimators are consistent. In the second scenario, in which a true model may not exist, SMAP is shown to be asymptotically optimal in the sense that its Kullback-Leibler loss is asymptotically identical to that of the best, but infeasible, model averaging estimator. Empirical evidence from simulation studies and a real data analysis is presented to support and illustrate our methods. (C) 2020 Elsevier B.V. All rights reserved.
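The two-step scheme described in the abstract (smooth one-dimensional marginal scores, then combine them with maximum-likelihood weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the truncated-power spline basis, the plain gradient-ascent fits, and the softmax parametrisation of the weights are illustrative choices, and the simulated data (a binary response driven by the first of three covariates) is invented for the demonstration.

```python
# Hypothetical sketch of the SMAP idea for a dichotomous response.
# Step 1: fit a one-dimensional spline-logistic score m_j(x_j) per covariate.
# Step 2: combine the scores as sum_j w_j * m_j(x_j) and choose the
# weights w by maximum likelihood (softmax-parametrised so they are
# nonnegative and sum to one). All modelling choices here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 3
X = rng.normal(size=(n, p))
# Simulated truth: the response depends only on the first covariate.
prob = 1.0 / (1.0 + np.exp(-1.5 * X[:, 0]))
y = rng.binomial(1, prob)

def spline_basis(x, knots):
    """Truncated-power cubic spline basis (one common spline choice)."""
    cols = [x, x**2, x**3] + [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack([np.ones_like(x)] + cols)

def fit_logistic(B, y, iters=200, lr=0.1):
    """Plain gradient-ascent logistic regression of y on basis B."""
    beta = np.zeros(B.shape[1])
    for _ in range(iters):
        pr = 1.0 / (1.0 + np.exp(-B @ beta))
        beta += lr * B.T @ (y - pr) / len(y)
    return beta

# Step 1: estimated marginal scores m_j(x_j) on the logit scale.
scores = np.empty((n, p))
for j in range(p):
    knots = np.quantile(X[:, j], [0.25, 0.5, 0.75])
    B = spline_basis(X[:, j], knots)
    scores[:, j] = B @ fit_logistic(B, y)

# Step 2: model-averaging weights by maximum likelihood.
theta = np.zeros(p)
for _ in range(300):
    w = np.exp(theta) / np.exp(theta).sum()      # softmax weights
    pr = 1.0 / (1.0 + np.exp(-(scores @ w)))
    grad_w = scores.T @ (y - pr) / n             # d loglik / d w
    # chain rule through the softmax Jacobian diag(w) - w w^T
    theta += 0.5 * (np.diag(w) - np.outer(w, w)) @ grad_w

w = np.exp(theta) / np.exp(theta).sum()
print(np.round(w, 3))  # most weight should fall on the informative covariate
```

Under scenario (i) of the abstract, where a true model exists among the candidates, the estimated weight vector should concentrate on that model, mirroring the paper's weight-one consistency result.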
Pages: 219-245
Page count: 27
References (41 in total)
[1] Ando, Tomohiro; Li, Ker-Chau. A weight-relaxed model averaging approach for high-dimensional generalized linear models. Annals of Statistics, 2017, 45(6): 2654-2679.
[2] Ando, Tomohiro; Li, Ker-Chau. A model-averaging approach for high-dimensional regression. Journal of the American Statistical Association, 2014, 109(505): 254-265.
[3] [Anonymous], 2001, Applied Mathematical Sciences.
[4] Buckland, S. T.; Burnham, K. P.; Augustin, N. H. Model selection: an integral part of inference. Biometrics, 1997, 53(2): 603-618.
[5] Chen, Jia; Li, Degui; Linton, Oliver; Lu, Zudi. Semiparametric ultra-high dimensional model averaging of nonlinear dynamic time series. Journal of the American Statistical Association, 2018, 113(522): 919-932.
[6] Claeskens, Gerda; Carroll, Raymond J. An asymptotic theory for model selection inference in general semiparametric problems. Biometrika, 2007, 94(2): 249-265.
[7] Cui, Hengjian; Li, Runze; Zhong, Wei. Model-free feature screening for ultrahigh dimensional discriminant analysis. Journal of the American Statistical Association, 2015, 110(510): 630-641.
[8] Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin. Feature augmentation via nonparametrics and selection (FANS) in high-dimensional classification. Journal of the American Statistical Association, 2016, 111(513): 275-287.
[9] Fan, Jianqing; Feng, Yang; Song, Rui. Nonparametric independence screening in sparse ultra-high-dimensional additive models. Journal of the American Statistical Association, 2011, 106(494): 544-557.
[10] Fan, J. Q.; Peng, H. Nonconcave penalized likelihood with a diverging number of parameters. Annals of Statistics, 2004, 32(3): 928-961.