Pretest estimation in combining probability and non-probability samples

Cited by: 1
Authors
Gao, Chenyin [1]
Yang, Shu [1]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / No. 01
Keywords
Data integration; dynamic borrowing; non-regularity; Pretest estimator; PSEUDO-LIKELIHOOD; REGRESSION; INFERENCE; NONRESPONSE; MODELS; INTEGRATION; ISSUES; WEAK;
DOI
10.1214/23-EJS2137
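Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract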
Multiple heterogeneous data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we develop a unified framework of the test-and-pool approach to general parameter estimation by combining gold-standard probability and non-probability samples. We focus on the case when the study variable is observed in both datasets for estimating the target parameters, and each contains other auxiliary variables. Utilizing the probability design, we conduct a pretest procedure to determine the comparability of the non-probability data with the probability data and decide whether or not to leverage the non-probability data in a pooled analysis. When the probability and non-probability data are comparable, our approach combines both data for efficient estimation. Otherwise, we retain only the probability data for estimation. We also characterize the asymptotic distribution of the proposed test-and-pool estimator under a local alternative and provide a data-adaptive procedure to select the critical tuning parameters that target the smallest mean square error of the test-and-pool estimator. Lastly, to deal with the non-regularity of the test-and-pool estimator, we construct a robust confidence interval that has a good finite-sample coverage property.
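The test-then-pool decision described in the abstract can be illustrated with a deliberately simplified sketch for a population mean. This is not the authors' estimator: the function names (`weighted_mean_and_var`, `pretest_pooled_mean`), the with-replacement variance approximation, the z-test used as the pretest, the inverse-variance pooling rule, and the fixed level `alpha` are all assumptions made for illustration. The paper instead works with general parameters and selects its tuning parameters by a data-adaptive procedure.

```python
import math
from statistics import NormalDist


def weighted_mean_and_var(y, w):
    """Weight-normalised (Hajek-type) mean and a rough variance estimate.

    Assumption: units are treated as sampled with replacement, which is a
    simplification and not the variance estimator used in the paper.
    """
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    n = len(y)
    resid = [wi * (yi - mean) / total_w for wi, yi in zip(w, y)]
    var = n / (n - 1) * sum(r * r for r in resid)
    return mean, var


def pretest_pooled_mean(y_prob, w_prob, y_nonprob, alpha=0.05):
    """Hypothetical toy version of a test-and-pool estimator for a mean.

    Returns (estimate, pooled_flag, z_statistic).
    """
    mean_p, var_p = weighted_mean_and_var(y_prob, w_prob)
    # Assumption: the non-probability sample is summarised with equal weights.
    mean_np, var_np = weighted_mean_and_var(y_nonprob, [1.0] * len(y_nonprob))

    # Pretest: are the two sample means comparable?
    z = (mean_np - mean_p) / math.sqrt(var_p + var_np)
    critical = NormalDist().inv_cdf(1 - alpha / 2)

    if abs(z) <= critical:
        # Comparable: pool the two estimates with inverse-variance weights.
        k_p, k_np = 1.0 / var_p, 1.0 / var_np
        return (k_p * mean_p + k_np * mean_np) / (k_p + k_np), True, z
    # Not comparable: keep only the probability-sample estimate.
    return mean_p, False, z


if __name__ == "__main__":
    y_prob, w_prob = [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]
    print(pretest_pooled_mean(y_prob, w_prob, [2.0, 3.0, 2.5, 2.5]))
    print(pretest_pooled_mean(y_prob, w_prob, [100.0, 101.0, 102.0, 103.0]))
```

Because the final estimate switches between two estimators depending on a test outcome, its sampling distribution is non-regular near the boundary of the test, which is why the abstract calls for a dedicated robust confidence interval rather than a standard normal-based one.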
Pages: 1492-1546
Page count: 55
Related papers
50 in total
  • [21] Inference for non-probability samples under high-dimensional covariate-adjusted superpopulation model
    Pan, Yingli
    Cai, Wen
    Liu, Zhan
    STATISTICAL METHODS AND APPLICATIONS, 2022, 31 (04) : 955 - 979
  • [22] Combining Probability and Nonprobability Samples on an Aggregated Level
    Aliste, Sofia F. Villalobos
    Scholtus, Sander
    de Waal, Ton
    JOURNAL OF OFFICIAL STATISTICS, 2025,
  • [23] Combining Inverse Probability Weighting and Multiple Imputation to Improve Robustness of Estimation
    Han, Peisong
    SCANDINAVIAN JOURNAL OF STATISTICS, 2016, 43 (01) : 246 - 260
  • [24] An Empirical Comparison of Methods to Produce Business Statistics Using Non-Probability Data
    Ang, Lyndon
    Clark, Robert
    Loong, Bronwyn
    Holmberg, Anders
    JOURNAL OF OFFICIAL STATISTICS, 2025, 41 (01) : 3 - 34
  • [25] Estimating General Parameters from Non-Probability Surveys Using Propensity Score Adjustment
    Castro-Martin, Luis
    Rueda, Maria del Mar
    Ferri-Garcia, Ramon
    MATHEMATICS, 2020, 8 (11) : 1 - 14
  • [26] A predictive model to estimate the pretest probability of metastasis in patients with osteosarcoma
    Wang, Sisheng
    Zheng, Shaoluan
    Hu, Kongzu
    Sun, Heyan
    Zhang, Jinling
    Rong, Genxiang
    Gao, Jie
    Ding, Nan
    Gui, Binjie
    MEDICINE, 2017, 96 (03)
  • [27] Combining Multiple Imputation and Inverse-Probability Weighting
    Seaman, Shaun R.
    White, Ian R.
    Copas, Andrew J.
    Li, Leah
    BIOMETRICS, 2012, 68 (01) : 129 - 137
  • [28] Inference from Non-Probability Surveys with Statistical Matching and Propensity Score Adjustment Using Modern Prediction Techniques
    Castro-Martin, Luis
    Rueda, Maria del Mar
    Ferri-Garcia, Ramon
    MATHEMATICS, 2020, 8 (06)
  • [29] Probability density estimation with surrogate data and validation sample
    Wang, Qihua
    Cui, Wenquan
    FRONTIERS OF MATHEMATICS IN CHINA, 2013, 8 (03) : 665 - 694
  • [30] Stable inverse probability weighting estimation for longitudinal studies
    Avagyan, Vahe
    Vansteelandt, Stijn
    SCANDINAVIAN JOURNAL OF STATISTICS, 2021, 48 (03) : 1046 - 1067