Model-Assisted Estimation Through Random Forests in Finite Population Sampling

Cited by: 19
Authors
Dagdoug, Mehdi [1]
Goga, Camelia [1]
Haziza, David [2]
Affiliations
[1] Univ Bourgogne Franche Comte, Lab Math Besancon, Besancon, France
[2] Univ Ottawa, Dept Math & Stat, 150 Louis Pasteur Private, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Model-assisted approach; Model-calibration; Nonparametric regression; Random forest; Survey data; Variance estimation; ASYMPTOTIC CONFIDENCE BANDS; AUXILIARY INFORMATION; VARIANCE REDUCTION; SURVEY DESIGN; APPROXIMATION;
DOI
10.1080/01621459.2021.1987250
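Chinese Library Classification (CLC) codes
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes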
020208 ; 070103 ; 0714 ;
Abstract
In surveys, the interest lies in estimating finite population parameters such as population totals and means. In most surveys, some auxiliary information is available at the estimation stage. This information may be incorporated into the estimation procedures to increase their precision. In this article, we use random forests (RFs) to estimate the functional relationship between the survey variable and the auxiliary variables. In recent years, RFs have become attractive because National Statistical Offices now have access to a variety of data sources, potentially containing a large number of observations on a large number of variables. We establish the theoretical properties of model-assisted procedures based on RFs and derive corresponding variance estimators. A model-calibration procedure for handling multiple survey variables is also discussed. The results of a simulation study suggest that the proposed point and variance estimation procedures perform well in terms of bias, efficiency, and coverage of normal-based confidence intervals in a wide variety of settings. Finally, we apply the proposed methods to data on radio audiences collected by Mediametrie, a French audience company. Supplementary materials for this article are available online.
Pages: 1234 - 1251
Number of pages: 18
Related papers
50 records in total
  • [31] Efficient Estimators of Finite Population Mean Based on Extreme Values in Simple Random Sampling
    Iftikhar, Anum
    Shi, Hongbo
    Hussain, Saddam
    Abbas, Mohsin
    Ullah, Kalim
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [32] Geographically weighted regression model-calibration for finite population parameter estimation under two stage sampling design
    Saha, Bappa
    Biswas, Ankur
    Ahmad, Tauqueer
    Misra Sahoo, Prachi
    Aditya, Kaustav
    Paul, Nobin Chandra
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024,
  • [33] Efficient estimation of population mean under stratified random sampling with linear cost function
    Mradula
    Yadav, Subhash Kumar
    Varshney, Rahul
    Dube, Madhulika
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (12) : 4364 - 4387
  • [34] Estimation of Population Variance in Two-Phase Sampling in Presence of Random Non-Response
    Bandyopadhyay, A.
    Singh, G. N.
    PAKISTAN JOURNAL OF STATISTICS AND OPERATION RESEARCH, 2015, 11 (04) : 525 - 542
  • [35] ENHANCED ESTIMATION OF POPULATION VARIANCE UNDER SIMPLE RANDOM SAMPLING WITH AN APPLICATION TO REAL DATA
    Kumar, Anoop
    Suhail, Rida
    Katara, Shubhra
    INTERNATIONAL JOURNAL OF AGRICULTURAL AND STATISTICAL SCIENCES, 2024, 20 (01) : 87 - 96
  • [36] Finite population estimation under generalized linear model assistance
    Marina Rondon, Luz
    Hernando Vanegas, Luis
    Ferraz, Cristiano
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (03) : 680 - 697
  • [37] Improved Ratio and Product Exponential type Estimators for Finite Population Mean in Stratified Random Sampling
    Yadav, Rohini
    Upadhyaya, Lakshmi N.
    Singh, Housila P.
    Chatterjee, S.
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2014, 43 (15) : 3269 - 3285
  • [38] ESTIMATION OF RATIO AND PRODUCT OF 2 FINITE POPULATION MEANS IN 2-PHASE SAMPLING
    SINGH, VK
    SINGH, HP
    SINGH, HP
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1994, 41 (02) : 163 - 171
  • [39] Some calibration estimators for finite population mean in two-stage stratified random sampling
    Singh, Dhirendra
    Sisodia, Bhupendra Veer Singh
    Nidhi
    Pundir, Sandeep
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2020, 49 (17) : 4234 - 4247
  • [40] Calibration estimation of population variance under stratified successive sampling in presence of random non response
    Singh, G. N.
    Bhattacharyya, D.
    Bandyopadhyay, A.
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (19) : 4487 - 4509