Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data

Cited by: 5
Authors
Golden, Richard M. [1 ]
Henley, Steven S. [2 ,3 ,4 ]
White, Halbert [5 ]
Kashner, T. Michael [3 ,4 ,6 ]
Affiliations
[1] Univ Texas Dallas, Sch Behav & Brain Sci, GR4-1,800 Campbell Rd, Richardson, TX 75080 USA
[2] Martingale Res Corp, 101 E Pk Blvd,Suite 600, Plano, TX 75074 USA
[3] Loma Linda Univ, Sch Med, Dept Med, Loma Linda, CA 92357 USA
[4] VA Loma Linda Healthcare Syst, Ctr Adv Stat Educ, Loma Linda, CA 92357 USA
[5] Univ Calif San Diego, Dept Econ, La Jolla, CA 92093 USA
[6] Dept Vet Affairs, Off Acad Affiliat 10X1, 810 Vermont Ave NW, Washington, DC 20420 USA
Keywords
asymptotic theory; ignorable; Generalized Information Matrix Test; misspecification; missing data; nonignorable; sandwich estimator; specification analysis; GENERALIZED LINEAR-MODELS; LONGITUDINAL BINARY DATA; MULTIPLE IMPUTATION; INFORMATION MATRIX; VERIFICATION BIAS; INCOMPLETE-DATA; COVARIATE DATA; INFERENCE; EM; IGNORABILITY;
DOI
10.3390/econometrics7030037
Chinese Library Classification
F [Economics];
Discipline classification code
02 ;
Abstract
Researchers are often faced with the challenge of developing statistical models with incomplete data. Exacerbating this situation is the possibility that either the researcher's complete-data model or the model of the missing-data mechanism is misspecified. In this article, we create a formal theoretical framework for developing statistical models and detecting model misspecification in the presence of incomplete data where maximum likelihood estimates are obtained by maximizing the observable-data likelihood function when the missing-data mechanism is assumed ignorable. First, we provide sufficient regularity conditions on the researcher's complete-data model to characterize the asymptotic behavior of maximum likelihood estimates in the simultaneous presence of both missing data and model misspecification. These results are then used to derive robust hypothesis testing methods for possibly misspecified models in the presence of Missing at Random (MAR) or Missing Not at Random (MNAR) missing data. Second, we introduce a method for the detection of model misspecification in missing data problems using recently developed Generalized Information Matrix Tests (GIMT). Third, we identify regularity conditions for the Missing Information Principle (MIP) to hold in the presence of model misspecification so as to provide useful computational covariance matrix estimation formulas. Fourth, we provide regularity conditions that ensure the observable-data expected negative log-likelihood function is convex in the presence of partially observable data when the amount of missingness is sufficiently small and the complete-data likelihood is convex. Fifth, we show that when the researcher has correctly specified a complete-data model with a convex negative likelihood function and an ignorable missing-data mechanism, then its strict local minimizer is the true parameter value for the complete-data model when the amount of missingness is sufficiently small. 
Our results thus provide new robust estimation, inference, and specification analysis methods for developing statistical models with incomplete data.
Pages: 27
Related papers
50 in total
  • [41] The Consequences of Model Misspecification for the Estimation of Nonlinear Interaction Effects
    Beiser-McGrath, Janina
    Beiser-McGrath, Liam F.
    POLITICAL ANALYSIS, 2023, 31 (02) : 278 - 287
  • [42] Handling missing data for causal effect estimation in cohort studies using Targeted Maximum Likelihood Estimation
    Dashti, Ghazaleh
    Lee, Katherine J.
    Simpson, Julie A.
    White, Ian R.
    Carlin, John B.
    Moreno-Betancur, Margarita
    INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2021, 50 : 55 - 55
  • [43] SIMPLIFIED MAXIMUM LIKELIHOOD INFERENCE BASED ON THE LIKELIHOOD DECOMPOSITION FOR MISSING DATA
    Jung, Sangah
    Park, Sangun
    AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2013, 55 (03) : 271 - 283
  • [44] Maximum Likelihood Estimation and Coarse Data
    Couso, Ines
    Dubois, Didier
    Huellermeier, Eyke
    SCALABLE UNCERTAINTY MANAGEMENT (SUM 2017), 2017, 10564 : 3 - 16
  • [45] MAXIMUM SMOOTHED LIKELIHOOD ESTIMATION AND SMOOTHED MAXIMUM LIKELIHOOD ESTIMATION IN THE CURRENT STATUS MODEL
    Groeneboom, Piet
    Jongbloed, Geurt
    Witte, Birgit I.
    ANNALS OF STATISTICS, 2010, 38 (01) : 352 - 387
  • [46] Maximum likelihood restoration of missing samples in sinusoidal data
    Abatzoglou, Theagenis
    Convery, Patrick
    2005 39th Asilomar Conference on Signals, Systems and Computers, Vols 1 and 2, 2005, : 942 - 945
  • [47] Assessment of maximum likelihood PCA missing data imputation
    Folch-Fortuny, Abel
    Arteaga, Francisco
    Ferrer, Alberto
    JOURNAL OF CHEMOMETRICS, 2016, 30 (07) : 386 - 393
  • [48] Maximum likelihood estimation with missing outcomes: From simplicity to complexity
    Baker, Stuart G.
    STATISTICS IN MEDICINE, 2019, 38 (22) : 4453 - 4474
  • [49] PSEUDO MAXIMUM-LIKELIHOOD ESTIMATION AND A TEST FOR MISSPECIFICATION IN MEAN AND COVARIANCE STRUCTURE MODELS
    ARMINGER, G
    SCHOENBERG, RJ
    PSYCHOMETRIKA, 1989, 54 (03) : 409 - 425
  • [50] Maximum likelihood estimation and forecasting of DNA sequence with missing values
    Gao Jie
    Xu Zhen-Yuan
    Zhang Li-Ting
    JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE, 2007, 4 (7-8) : 1237 - 1242