Statistical data integration in survey sampling: a review

Cited by: 44
Authors
Yang, Shu [1]
Kim, Jae Kwang [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC USA
[2] Iowa State Univ, Dept Stat, Ames, IA 50011 USA
Funding
National Science Foundation (USA)
Keywords
Generalizability; Meta-analysis; Missing at random; Transportability; Propensity score; Combining information; Multiple surveys; Generalizing evidence; Robust estimation; Causal inference; Missing data; Probability; Calibration; Imputation
DOI
10.1007/s42081-020-00093-w
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Finite population inference is a central goal in survey sampling. Probability sampling is the main statistical approach to finite population inference. Challenges arise due to high cost and increasing non-response rates. Data integration provides a timely solution by leveraging multiple data sources to provide more robust and efficient inference than using any single data source alone. The technique for data integration varies depending on types of samples and available information to be combined. This article provides a systematic review of data integration techniques for combining probability samples, probability and non-probability samples, and probability and big data samples. We discuss a wide range of integration methods such as generalized least squares, calibration weighting, inverse probability weighting, mass imputation, and doubly robust methods. Finally, we highlight important questions for future research.
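As a rough illustration of the simplest case named in the abstract (combining two probability samples by generalized least squares), the sketch below pools independent, unbiased estimates of the same population quantity by inverse-variance weighting. The function name gls_combine and the numbers are hypothetical and not taken from the article; the sketch assumes the estimates are independent and their variances are known, which real survey integration rarely guarantees.

```python
def gls_combine(estimates, variances):
    """Pool independent unbiased estimates of one parameter.

    With a diagonal covariance matrix, the GLS estimator reduces to an
    inverse-variance weighted mean. Returns (combined estimate, variance).
    Assumption: estimates are independent with known variances.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need equally many estimates and variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Each estimate is weighted by its precision (1 / variance).
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    combined = sum(p * e for p, e in zip(precisions, estimates)) / total_precision

    # The variance of the pooled estimate is the inverse of the total precision.
    return combined, 1.0 / total_precision


# Hypothetical inputs: two survey estimates of the same population mean.
combined, variance = gls_combine([10.0, 12.0], [4.0, 1.0])
print(combined, variance)
```

The pooled variance is never larger than the smallest input variance, which is the sense in which integration can be more efficient than any single source under these assumptions.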
Pages: 625-650
Number of pages: 26