Models as Approximations II: A Model-Free Theory of Parametric Regression

Cited by: 31
Authors
Buja, Andreas [1 ,2 ,3 ]
Brown, Lawrence [3 ]
Kuchibhotla, Arun Kumar [3 ]
Berk, Richard [4 ]
George, Edward
Zhao, Linda [1 ,3 ]
Affiliations
[1] First Pacific Co, Hong Kong, Peoples R China
[2] Univ Penn, Wharton Sch, Dept Stat, Stat, 400 Jon M Huntsman Hall,3730 Walnut St, Philadelphia, PA 19104 USA
[3] Univ Penn, Wharton Sch, Dept Stat, Stat, 400 Jon M Huntsman Hall,3730 Walnut St, Philadelphia, PA 19104 USA
[4] Univ Penn, Wharton Sch, Dept Stat, Criminol & Stat, 400 Jon M Huntsman Hall,3730 Walnut St, Philadelphia, PA 19104 USA
Keywords
Ancillarity of regressors; misspecification; econometrics; sandwich estimator; bootstrap; bagging;
DOI
10.1214/18-STS694
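Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract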
We develop a model-free theory of general types of parametric regression for i.i.d. observations. The theory replaces the parameters of parametric models with statistical functionals, to be called "regression functionals," defined on large nonparametric classes of joint x-y distributions, without assuming a correct model. Parametric models are reduced to heuristics to suggest plausible objective functions. An example of a regression functional is the vector of slopes of linear equations fitted by OLS to largely arbitrary x-y distributions, without assuming a linear model (see Part I). More generally, regression functionals can be defined by minimizing objective functions, solving estimating equations, or with ad hoc constructions. In this framework, it is possible to achieve the following: (1) define a notion of "well-specification" for regression functionals that replaces the notion of correct specification of models, (2) propose a well-specification diagnostic for regression functionals based on reweighting distributions and data, (3) decompose sampling variability of regression functionals into two sources, one due to the conditional response distribution and another due to the regressor distribution interacting with misspecification, both of order $N^{-1/2}$, (4) exhibit plug-in/sandwich estimators of standard error as limit cases of x-y bootstrap estimators, and (5) provide theoretical heuristics to indicate that x-y bootstrap standard errors may generally be preferred over sandwich estimators.
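As a rough illustration of points (4) and (5), the sketch below treats the OLS coefficients as a regression functional on data where the linear model is deliberately wrong, and computes both a plug-in sandwich standard error and an x-y (pairs) bootstrap standard error. It is not taken from the paper: the function names, the simulated quadratic response, the HC0 form of the sandwich, the sample size, and the number of bootstrap replicates are all assumptions chosen for the example.

```python
import numpy as np

def ols_coefs(X, y):
    """OLS coefficients (intercept first), viewed as a functional of the x-y data."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta

def sandwich_se(X, y):
    """Plug-in sandwich standard errors (HC0 form, an assumption of this sketch)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    resid = y - Xd @ ols_coefs(X, y)
    bread = np.linalg.inv(Xd.T @ Xd)
    meat = Xd.T @ (Xd * resid[:, None] ** 2)
    return np.sqrt(np.diag(bread @ meat @ bread))

def xy_bootstrap_se(X, y, n_boot=500, seed=0):
    """x-y bootstrap standard errors: rows (x_i, y_i) are resampled jointly."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1] + 1))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample row indices with replacement
        draws[b] = ols_coefs(X[idx], y[idx])
    return draws.std(axis=0, ddof=1)

# Simulated data (hypothetical): the true response is quadratic in x,
# so a fitted straight line is misspecified.
rng = np.random.default_rng(1)
n = 400
x = rng.uniform(0.0, 2.0, size=(n, 1))
y = x[:, 0] ** 2 + rng.normal(scale=0.3, size=n)

beta_hat = ols_coefs(x, y)
se_sand = sandwich_se(x, y)
se_boot = xy_bootstrap_se(x, y)
print("coefficients:", beta_hat)
print("sandwich SE: ", se_sand)
print("x-y boot SE: ", se_boot)
```

In this setup the slope still has a well-defined target (the population least-squares slope, which equals 2 for a uniform regressor on [0, 2]) even though no linear model holds. The two standard error estimates would be expected to be of similar size at this sample size.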
Pages: 545-565
Number of pages: 21