Doubly robust estimation for non-probability samples with modified intertwined probabilistic factors decoupling

Cited by: 3
Authors
Liu, Zhan [1]
Zheng, Junbo [1]
Pan, Yingli [1, 2]
Affiliations
[1] Hubei Univ, Sch Math & Stat, Hubei Key Lab Appl Math, Wuhan, Peoples R China
[2] Hubei Univ, Sch Math & Stat, Hubei Key Lab Appl Math, Wuhan 430062, Peoples R China
Keywords
doubly robust estimation; high-dimensional data; IPAD; non-probability samples; false discovery rate; regression coefficients; variable selection; inference
DOI
10.1002/sam.11614
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, non-probability samples, such as web survey samples, have become increasingly popular in many fields, but they may be subject to selection bias, which makes inference from them difficult. Doubly robust (DR) estimation is one approach to making inferences from non-probability samples. When many covariates are available, variable selection becomes important in DR estimation. In this paper, a new DR estimator for the finite population mean is constructed, where intertwined probabilistic factors decoupling (IPAD) and a modified IPAD are used to select important variables in the propensity score model and the outcome superpopulation model, respectively. Unlike traditional variable selection approaches, such as the adaptive least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD), IPAD and the modified IPAD not only select important variables and estimate parameters but also control the false discovery rate, which can produce more accurate population estimators. Asymptotic theory and variance estimation for the DR estimator with the modified IPAD are established. Results from simulation studies indicate that the proposed estimator performs well. We apply the proposed method to the analysis of Pew Research Center data and Behavioral Risk Factor Surveillance System data.
Pages: 224-236
Number of pages: 13