Inferences from logistic regression models in the presence of small samples, rare events, nonlinearity, and multicollinearity with observational data

Cited by: 28
Authors
Bergtold, Jason S. [1]
Yeager, Elizabeth A. [1]
Featherstone, Allen M. [1]
Affiliations
[1] Kansas State Univ, Dept Agr Econ, 307 Waters Hall, Manhattan, KS 66506 USA
Keywords
Logistic regression model; multicollinearity; nonlinearity; rare events; inference; small sample bias; C18; C35; C83; PRIMER;
DOI
10.1080/02664763.2017.1282441
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
The logistic regression model has been widely used in the social and natural sciences and results from studies using this model can have significant policy impacts. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this article is to examine the impact of alternative data sets on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model with observational data. A number of simulations are conducted examining the impact of sample size, nonlinear predictors, and multicollinearity on substantive inferences (e.g. odds ratios, marginal effects) when using logistic regression models. Findings suggest that small sample size can negatively affect the quality of parameter estimates and inferences in the presence of rare events, multicollinearity, and nonlinear predictor functions, but marginal effects estimates are relatively more robust to sample size.
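The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the authors' design: the single standard-normal predictor, the parameter values, the replication count, and the helper names (`simulate`, `fit_logit`, `mean_slope_bias`) are all assumptions made for illustration. It draws binary outcomes from a logit model with a low baseline event rate, fits the model by Newton-Raphson maximum likelihood, and compares the mean bias of the slope estimate at two sample sizes.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, b0, b1, rng):
    # One standard-normal predictor (an assumption of this sketch).
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(b0 + b1 * x) else 0 for x in xs]
    return xs, ys


def fit_logit(xs, ys, max_iter=50, tol=1e-8):
    """Newton-Raphson MLE for (intercept, slope); None if it fails to converge."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p          # score, intercept
            g1 += (y - p) * x    # score, slope
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-10:
            # Near-singular information matrix, e.g. separation or no events.
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            return b0, b1
    return None


def mean_slope_bias(n, b0, b1, reps, seed=0):
    # Average (estimate - truth) over the replications that converged.
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        fit = fit_logit(*simulate(n, b0, b1, rng))
        if fit is not None:
            estimates.append(fit[1])
    if not estimates:
        return float("nan"), 0
    return sum(estimates) / len(estimates) - b1, len(estimates)


if __name__ == "__main__":
    # Intercept of -3 gives a rare-event setting (illustrative values only).
    for n in (50, 500):
        bias, used = mean_slope_bias(n, b0=-3.0, b1=1.0, reps=200)
        print(f"n={n}: mean slope bias={bias:.3f} over {used} converged fits")
```

Replications in which the fit does not converge (for example, because of separation or a sample with no events) are dropped here for simplicity; a fuller study would need to decide how to treat them, since discarding them can itself affect the measured bias.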
Pages: 528-546
Page count: 19