Superpopulation model inference for non probability samples under informative sampling with high-dimensional data

Cited by: 0
Authors
Liu, Zhan [1]
Wang, Dianni [1]
Pan, Yingli [1]
Affiliations
[1] Hubei Univ, Sch Math & Stat, Hubei Key Lab Appl Math, Wuhan 430062, Peoples R China
Keywords
Non probability samples; superpopulation model; informative sampling; high-dimensional data; variable selection;
DOI
10.1080/03610926.2024.2335543
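Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract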
Non probability samples have been widely used in various fields. However, non probability samples suffer from selection biases because the selection probabilities are unknown. Superpopulation model inference methods have been discussed to solve this problem, but these approaches require the non informative sampling assumption. When the sampling mechanism is informative, that is, when selection probabilities are related to the outcome variable, the previous inference methods may be invalid. Moreover, we may encounter a large number of covariates in practice, which poses a new challenge for inference from non probability samples under informative sampling. In this article, superpopulation model approaches under informative sampling with high-dimensional data are developed to perform valid inferences from non probability samples. Specifically, a semiparametric exponential tilting model is established to estimate selection probabilities, and the sample distribution is derived for estimating the superpopulation model parameters. Moreover, SCAD, adaptive LASSO, and Model-X knockoffs are employed to select variables and estimate parameters in superpopulation modeling. Asymptotic properties of the proposed estimators are established. Results from simulation studies are presented to compare the performance of the proposed estimators with the naive estimator, which ignores informative sampling. The proposed methods are further applied to the National Health and Nutrition Examination Survey data.
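The bias that the abstract attributes to informative sampling can be illustrated with a small simulation. The sketch below is not the authors' method: the linear outcome model, the logistic selection rule, and every coefficient are assumptions chosen purely for illustration. The weighted estimator uses the true selection probabilities, which are available here only because the data are simulated; in the setting of the paper they are unknown and have to be estimated.

```python
import math
import random


def simulate(n_pop=50_000, gamma=0.5, seed=0):
    """Draw a finite population and an informative sample from it.

    All model choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    pop_y, sample = [], []
    for _ in range(n_pop):
        x = rng.gauss(0.0, 1.0)
        # Assumed superpopulation model: y = 1 + x + noise.
        y = 1.0 + x + rng.gauss(0.0, 1.0)
        # Selection probability depends on y itself, so sampling is informative.
        pi = 1.0 / (1.0 + math.exp(-(-1.0 + gamma * y)))
        pop_y.append(y)
        if rng.random() < pi:
            sample.append((y, pi))
    return pop_y, sample


pop_y, sample = simulate()
pop_mean = sum(pop_y) / len(pop_y)

# Naive estimator: ignores the selection mechanism entirely.
naive_mean = sum(y for y, _ in sample) / len(sample)

# Hajek-type weighted mean with the TRUE selection probabilities
# (known only because this is a simulation).
weighted_mean = (
    sum(y / pi for y, pi in sample) / sum(1.0 / pi for _, pi in sample)
)

print(f"population mean: {pop_mean:.3f}")
print(f"naive mean:      {naive_mean:.3f}")
print(f"weighted mean:   {weighted_mean:.3f}")
```

Because units with larger outcomes are more likely to be selected, the naive sample mean overshoots the population mean, while the weighted mean stays close to it. Setting `gamma=0.0` makes selection non informative, and the gap between the two estimators should largely disappear.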
Pages: 1370-1390
Number of pages: 21
Related papers
50 in total
  • [1] Superpopulation model inference for non-probability samples under informative sampling
    Liu, Zhan
    Wang, Dianni
    Pan, Yingli
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024
  • [2] Inference for non-probability samples under high-dimensional covariate-adjusted superpopulation model
    Pan, Yingli
    Cai, Wen
    Liu, Zhan
    STATISTICAL METHODS AND APPLICATIONS, 2022, 31 (04): 955-979
  • [3] Inference for non-probability samples under high-dimensional covariate-adjusted superpopulation model
    Yingli Pan
    Wen Cai
    Zhan Liu
    Statistical Methods & Applications, 2022, 31: 955-979
  • [4] Doubly robust inference when combining probability and non-probability samples with high dimensional data
    Yang, Shu
    Kim, Jae Kwang
    Song, Rui
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2020, 82 (02): 445-465
  • [5] ASYMPTOTIC INFERENCE FOR HIGH-DIMENSIONAL DATA
    Kuelbs, Jim
    Vidyashankar, Anand N.
    ANNALS OF STATISTICS, 2010, 38 (02): 836-869
  • [6] Model-Free Statistical Inference on High-Dimensional Data
    Guo, Xu
    Li, Runze
    Zhang, Zhe
    Zou, Changliang
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024
  • [7] Inference for the case probability in high-dimensional logistic regression
    Guo, Zijian
    Rakshit, Prabrisha
    Herman, Daniel S.
    Chen, Jinbo
    Journal of Machine Learning Research, 2021, 22
  • [8] Maximally Informative Hierarchical Representations of High-Dimensional Data
    Ver Steeg, Greg
    Galstyan, Aram
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 38, 2015, 38: 1004-1012
  • [9] Differentially Private High-Dimensional Data Publication via Sampling-Based Inference
    Chen, Rui
    Xiao, Qian
    Zhang, Yu
    Xu, Jianliang
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015: 129-138
  • [10] Inference for High-Dimensional Streamed Longitudinal Data
    Senyuan Zheng
    Ling Zhou
    Acta Mathematica Sinica, English Series, 2025, (02): 757-779