Using model-assisted calibration methods to improve efficiency of regression analyses using two-phase samples or pooled samples under complex survey designs

Times cited: 0
Authors
Wang, Lingxiao [1, 2]
Affiliations
[1] Univ Virginia, Dept Stat, 148 Amphitheater Way, Charlottesville, VA 22903 USA
[2] NCI, Biostat Branch, Div Canc Epidemiol & Genet, Rockville, MD 20850 USA
Funding
National Institutes of Health (USA);
Keywords
calibration; complex survey data analysis; data integration; regression analysis; two-phase design; COHORT; RISK; ESTIMATORS; GLUCOSE; DISEASE;
DOI
10.1093/biomtc/ujaf092
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Two-phase sampling designs are frequently applied in epidemiological studies and large-scale health surveys. In such designs, certain variables are collected exclusively within a second-phase random subsample of the initial first-phase sample, often due to factors such as high costs, response burden, or constraints on data collection or assessment. Consequently, second-phase sample estimators can be inefficient due to the diminished sample size. Model-assisted calibration methods have been used to improve the efficiency of second-phase estimators in regression analysis. However, limited literature provides valid finite population inferences of the calibration estimators that use appropriate calibration auxiliary variables while simultaneously accounting for the complex sample designs in the first- and second-phase samples. Moreover, no literature considers the "pooled design" where some covariates are measured exclusively in certain repeated survey cycles. This paper proposes calibrating the sample weights for the second-phase sample to the weighted first-phase sample based on score functions of the regression model that uses predictions of the second-phase variable for the first-phase sample. We establish the consistency of estimation using calibrated weights and provide variance estimation for the regression coefficients under the two-phase design or the pooled design nested within complex survey designs. Empirical evidence highlights the efficiency and robustness of the proposed calibration compared to existing calibration and imputation methods. Data examples from the National Health and Nutrition Examination Survey are provided.
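The calibration idea in the abstract can be illustrated with a minimal sketch of linear (GREG-type) weight calibration: second-phase weights are adjusted so that their weighted totals of some auxiliary variables match the weighted first-phase totals. This is an illustration under stated assumptions, not the paper's estimator. The paper builds its auxiliary variables from score functions of the regression model evaluated with predicted second-phase variables, whereas the sketch uses a generic intercept and covariate as stand-ins. The function name `linear_calibrate`, the simulated data, and the 0.3 subsampling rate are all hypothetical.

```python
import numpy as np

def linear_calibrate(d2, Z2, totals):
    """Return calibrated weights w = d2 * (1 + Z2 @ lam) such that
    Z2.T @ w equals `totals` (linear calibration, chi-square distance)."""
    d2 = np.asarray(d2, dtype=float)
    Z2 = np.asarray(Z2, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Weighted cross-product matrix of the auxiliaries in the phase-2 sample.
    M = Z2.T @ (d2[:, None] * Z2)
    # Lagrange multipliers that close the gap between target and current totals.
    lam = np.linalg.solve(M, totals - Z2.T @ d2)
    return d2 * (1.0 + Z2 @ lam)

# Simulated first-phase sample (all values are made up for illustration).
rng = np.random.default_rng(0)
n1 = 500
x = rng.normal(size=n1)
w1 = rng.uniform(1.0, 3.0, size=n1)        # first-phase survey weights
Z1 = np.column_stack([np.ones(n1), x])     # stand-in auxiliary variables

# Second-phase subsample drawn with an assumed inclusion probability of 0.3.
in_phase2 = rng.random(n1) < 0.3
d2 = w1[in_phase2] / 0.3                   # initial second-phase weights
Z2 = Z1[in_phase2]

# Calibration targets: weighted first-phase totals of the auxiliaries.
totals = Z1.T @ w1
w_cal = linear_calibrate(d2, Z2, totals)
```

After calibration, `Z2.T @ w_cal` reproduces `totals` up to floating-point error, which is the defining constraint of the method. Linear calibration can produce negative weights in small or unbalanced samples; raking or bounded distance functions are common alternatives when that matters.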
Pages: 12