Improving main analysis by borrowing information from auxiliary data

Cited by: 11
Authors
Chen, Chixiang [1]
Han, Peisong [2]
He, Fan [3]
Affiliations
[1] Univ Maryland, Dept Epidemiol & Publ Hlth, Div Biostat & Bioinformat, Sch Med, Baltimore, MD 21201 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Penn State Coll Med, Dept Publ Hlth Sci, Div Biostat & Bioinformat, Hershey, PA USA
Keywords
auxiliary data; empirical likelihood; estimation efficiency improvement; information borrowing; information index; EMPIRICAL-LIKELIHOOD; LONGITUDINAL DATA; ATHEROSCLEROSIS RISK; REGRESSION-ANALYSIS; COMMUNITIES; IMPUTATION; OUTCOMES; MODELS; HEALTH
DOI
10.1002/sim.9252
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification code
07; 0710; 09
Abstract
In many clinical and observational studies, auxiliary data from the same subjects, such as repeated measurements or surrogate variables, will be collected in addition to the data of main interest. Not directly related to the main study, these auxiliary data in practice are rarely incorporated into the main analysis, though they may carry extra information that can help improve the estimation in the main analysis. Under the setting where part of or all subjects have auxiliary data available, we propose an effective weighting approach to borrow the auxiliary information by building a working model for the auxiliary data, where improvement of estimation precision over the main analysis is guaranteed regardless of the specification of the working model. An information index is also constructed to assess how well the selected working model works to improve the main analysis. Both theoretical and numerical studies show the excellent and robust performance of the proposed method in comparison to estimation without using the auxiliary data. Finally, we utilize the Atherosclerosis Risk in Communities study for illustration.
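The general idea of reweighting subjects so that an auxiliary moment condition holds can be sketched with a textbook empirical-likelihood calculation. The sketch below is not the authors' estimator: the function names `el_weights`, `simulate`, and `compare_variance`, the simulated data, and the assumption that the auxiliary variable `z` has a known mean of zero are all hypothetical choices made for illustration. Unlike the method in the abstract, this toy version relies on the auxiliary constraint being correct.

```python
import numpy as np


def el_weights(g, iters=100):
    """Empirical-likelihood weights p_i that maximize sum(log p_i)
    subject to sum(p_i) = 1 and sum(p_i * g_i) = 0, for scalar g_i."""
    g = np.asarray(g, dtype=float)
    if not (g.min() < 0.0 < g.max()):
        raise ValueError("0 must lie strictly inside the range of g")
    n = g.size
    # The Lagrange multiplier must keep 1 + lam * g_i > 0 for every i.
    lo, hi = -1.0 / g.max(), -1.0 / g.min()
    # sum(g / (1 + lam * g)) is strictly decreasing in lam, so bisect.
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + lam * g)) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 1.0 / (n * (1.0 + lam * g))


def simulate(n, rng):
    """Hypothetical data: auxiliary z (assumed mean 0) correlated with y."""
    z = rng.normal(size=n)
    y = 1.0 + 0.8 * z + rng.normal(scale=0.6, size=n)
    return y, z


def compare_variance(reps=300, n=200, seed=0):
    """Monte Carlo variance of the plain mean of y versus the
    empirical-likelihood weighted mean that borrows from z."""
    rng = np.random.default_rng(seed)
    plain, weighted = [], []
    for _ in range(reps):
        y, z = simulate(n, rng)
        p = el_weights(z)  # enforce sum(p * z) = 0
        plain.append(y.mean())
        weighted.append(np.sum(p * y))
    return np.var(plain), np.var(weighted)


if __name__ == "__main__":
    var_plain, var_el = compare_variance()
    print(f"variance of plain mean:    {var_plain:.5f}")
    print(f"variance of weighted mean: {var_el:.5f}")
```

In this toy setting the weights shift mass toward subjects whose auxiliary values offset the sample imbalance in `z`, which is why the weighted mean of `y` tends to vary less across replications than the unweighted mean.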
Pages: 567-579
Page count: 13