Superpopulation model inference for non probability samples under informative sampling with high-dimensional data

Cited: 0
Authors
Liu, Zhan [1 ]
Wang, Dianni [1 ]
Pan, Yingli [1 ]
Affiliations
[1] Hubei Univ, Sch Math & Stat, Hubei Key Lab Appl Math, Wuhan 430062, Peoples R China
Keywords
Non-probability samples; superpopulation model; informative sampling; high-dimensional data; variable selection
DOI
10.1080/03610926.2024.2335543
CLC Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
Non-probability samples are widely used across many fields, but they suffer from selection bias because their selection probabilities are unknown. Superpopulation model inference methods have been proposed to address this problem, but these approaches require a non-informative sampling assumption. When sampling is informative, that is, when the selection probabilities are related to the outcome variable, those earlier inference methods may be invalid. Moreover, a large number of covariates may be encountered in practice, which poses a further challenge for inference from non-probability samples under informative sampling. In this article, superpopulation model approaches under informative sampling with high-dimensional data are developed to perform valid inference from non-probability samples. Specifically, a semiparametric exponential tilting model is established to estimate the selection probabilities, and the sample distribution is derived for estimating the superpopulation model parameters. SCAD, adaptive LASSO, and Model-X knockoffs are employed to select variables and estimate parameters in the superpopulation model. Asymptotic properties of the proposed estimators are established. Simulation studies compare the proposed estimators with the naive estimator, which ignores informative sampling, and the methods are further applied to National Health and Nutrition Examination Survey data.
Pages: 1370-1390 (21 pages)
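The abstract's central point, that ignoring outcome-dependent (informative) selection biases superpopulation parameter estimates while weighting by the selection probabilities corrects it, can be illustrated with a small simulation. This is a hedged sketch, not the authors' implementation: the logistic selection model, the simulated design, and the oracle use of the true inclusion probabilities are all assumptions standing in for the paper's semiparametric exponential tilting estimator, and no penalized variable selection (SCAD, adaptive LASSO, Model-X knockoffs) is attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Superpopulation: sparse linear model with only 3 active covariates among 10.
N, p = 200_000, 10
beta = np.zeros(p)
beta[:3] = [1.0, -0.5, 0.8]
X = rng.normal(size=(N, p))
y = X @ beta + rng.normal(size=N)

# Informative selection: inclusion probability grows with the outcome itself.
# This logistic form is an assumption for illustration only.
pi = 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * y)))
S = rng.random(N) < pi            # realized non-probability sample
Xs, ys, pis = X[S], y[S], pi[S]

def wls(Xm, yv, w):
    """Weighted least squares via square-root-weight rescaling."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(Xm * sw[:, None], yv * sw, rcond=None)[0]

# Naive estimator: ordinary least squares on the selected sample,
# ignoring the informative selection mechanism.
beta_naive = wls(Xs, ys, np.ones(Xs.shape[0]))

# Inverse-probability-weighted estimator: here the true pi is known only
# because we simulated it; the paper instead estimates it from the data.
beta_ipw = wls(Xs, ys, 1.0 / pis)

print("naive:", np.round(beta_naive[:3], 3))
print("ipw  :", np.round(beta_ipw[:3], 3))
```

On a run like this, the naive slopes on the active covariates are attenuated because units with large outcomes are over-represented, while the weighted estimator recovers the superpopulation coefficients up to sampling noise.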