Pretest estimation in combining probability and non-probability samples

Cited by: 1
Authors
Gao, Chenyin [1]
Yang, Shu [1]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / No. 1
Keywords
Data integration; dynamic borrowing; non-regularity; Pretest estimator; PSEUDO-LIKELIHOOD; REGRESSION; INFERENCE; NONRESPONSE; MODELS; INTEGRATION; ISSUES; WEAK;
DOI
10.1214/23-EJS2137
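Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714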
Abstract
Multiple heterogeneous data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we develop a unified framework of the test-and-pool approach to general parameter estimation by combining gold-standard probability and non-probability samples. We focus on the case when the study variable is observed in both datasets for estimating the target parameters, and each contains other auxiliary variables. Utilizing the probability design, we conduct a pretest procedure to determine the comparability of the non-probability data with the probability data and decide whether or not to leverage the non-probability data in a pooled analysis. When the probability and non-probability data are comparable, our approach combines both data for efficient estimation. Otherwise, we retain only the probability data for estimation. We also characterize the asymptotic distribution of the proposed test-and-pool estimator under a local alternative and provide a data-adaptive procedure to select the critical tuning parameters that target the smallest mean square error of the test-and-pool estimator. Lastly, to deal with the non-regularity of the test-and-pool estimator, we construct a robust confidence interval that has a good finite-sample coverage property.
Pages: 1492-1546
Page count: 55