Pretest estimation in combining probability and non-probability samples

Cited by: 1
Authors
Gao, Chenyin [1]
Yang, Shu [1]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / No. 01
Keywords
Data integration; dynamic borrowing; non-regularity; Pretest estimator; PSEUDO-LIKELIHOOD; REGRESSION; INFERENCE; NONRESPONSE; MODELS; INTEGRATION; ISSUES; WEAK;
DOI
10.1214/23-EJS2137
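Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes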
020208 ; 070103 ; 0714 ;
Abstract
Multiple heterogeneous data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we develop a unified framework of the test-and-pool approach to general parameter estimation by combining gold-standard probability and non-probability samples. We focus on the case when the study variable is observed in both datasets for estimating the target parameters, and each contains other auxiliary variables. Utilizing the probability design, we conduct a pretest procedure to determine the comparability of the non-probability data with the probability data and decide whether or not to leverage the non-probability data in a pooled analysis. When the probability and non-probability data are comparable, our approach combines both data for efficient estimation. Otherwise, we retain only the probability data for estimation. We also characterize the asymptotic distribution of the proposed test-and-pool estimator under a local alternative and provide a data-adaptive procedure to select the critical tuning parameters that target the smallest mean square error of the test-and-pool estimator. Lastly, to deal with the non-regularity of the test-and-pool estimator, we construct a robust confidence interval that has a good finite-sample coverage property.
Pages: 1492-1546
Page count: 55
Related papers
50 in total
  • [1] Combining non-probability and probability survey samples through mass imputation
    Kim, Jae Kwang
    Park, Seho
    Chen, Yilin
    Wu, Changbao
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2021, 184 (03) : 941 - 963
  • [2] Doubly robust inference when combining probability and non-probability samples with high dimensional data
    Yang, Shu
    Kim, Jae Kwang
    Song, Rui
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2020, 82 (02) : 445 - 465
  • [3] Kernel Weighting for blending probability and non-probability survey samples
    del Mar Rueda, Maria
    Cobo, Beatriz
    Rueda-Sanchez, Jorge Luis
    Ferri-Garcia, Ramon
    Castro-Martin, Luis
    SORT-STATISTICS AND OPERATIONS RESEARCH TRANSACTIONS, 2024, 48 (01) : 93 - 124
  • [4] Doubly robust estimation for non-probability samples with heterogeneity
    Liu, Zhan
    Sun, Yi
    Li, Yong
    Li, Yuanmeng
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2025, 465
  • [5] Integrating probability and big non-probability samples data to produce Official Statistics
    Golini, Natalia
    Righi, Paolo
    STATISTICAL METHODS AND APPLICATIONS, 2024, 33 (02) : 555 - 580
  • [6] Dealing with undercoverage for non-probability survey samples
    Chen, Yilin
    Li, Pengfei
    Wu, Changbao
    SURVEY METHODOLOGY, 2023, 49 (02)
  • [7] The R package NonProbEst for estimation in non-probability surveys
    Rueda, M.
    Ferri-Garcia, R.
    Castro, L.
    R JOURNAL, 2020, 12 (01) : 406 - 418
  • [8] Probability and non-probability samples: Improving regression modeling by using data from different sources
    Tutz, Gerhard
    INFORMATION SCIENCES, 2023, 621 : 424 - 436
  • [9] Doubly robust estimation for non-probability samples with modified intertwined probabilistic factors decoupling
    Liu, Zhan
    Zheng, Junbo
    Pan, Yingli
    STATISTICAL ANALYSIS AND DATA MINING, 2023, 16 (03) : 224 - 236
  • [10] Handling non-probability samples through inverse probability weighting with an application to Statistics Canada's crowdsourcing data
    Beaumont, Jean-Francois
    Bosa, Keven
    Brennan, Andrew
    Charlebois, Joanne
    Chu, Kenneth
    SURVEY METHODOLOGY, 2024, 50 (01)