The impact of record-linkage bias in the Cox model

Cited by: 23
Authors
Baldi, Ileana
Ponti, Antonio [3]
Zanetti, Roberto [4]
Ciccone, Giovannino [2]
Merletti, Franco
Gregori, Dario [1]
Affiliations
[1] Univ Turin, Dept Publ Hlth & Microbiol, I-10126 Turin, Italy
[2] CPO Piemonte, Canc Epidemiol Unit, Clin Epidemiol Sect, Turin, Italy
[3] CPO Piemonte, Epidemiol Unit, Turin, Italy
[4] CPO Piemonte, Piedmont Canc Registry, Turin, Italy
Keywords
breast cancer; Cox model; estimation bias; proportional hazard; record-linkage; CANCER; TESTS;
DOI
10.1111/j.1365-2753.2009.01119.x
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Rationale, aims and objectives: Record linkage (RL) has become increasingly useful in health care administration, demographic studies, the provision of health statistics and medical research. Linkage failure may occur when databases contain missing or inaccurate information. In particular, if the records that fail to link are not representative of the original population, results obtained from the linked data may be biased. This paper discusses the impact of incomplete RL on survival analysis.
Methods: We assess by simulation the potential impact of this bias, which we refer to as RL bias, on the estimated covariate effects in the Cox regression model. We also evaluate the RL bias introduced by an incomplete linkage procedure in a survival analysis of a cohort of patients with breast cancer.
Results: In our simulation study, the relative bias for the factors on which the linking probability depends reaches 20% and is never less than 5%. The bias observed in the simulation for a comparable scenario is consistent with the bias estimated from the breast cancer records.
Conclusions: Incomplete RL is rarely taken into account explicitly in survival analysis models. This study indicates that ignoring it can lead to inefficient and biased results, particularly with small or medium sample sizes.
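The kind of simulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' design: the exponential baseline hazard, the single binary covariate, the administrative censoring time and the linkage probabilities in `LINK_PROB` are all assumptions chosen for illustration, and the helper names are hypothetical. Survival times are drawn by inverse transform, a one-covariate Cox model is fitted by Newton-Raphson on the partial likelihood, and the fit on the full cohort is compared with the fit on the linked subset.

```python
import math
import random

# Hypothetical linkage probabilities, keyed by (covariate, event observed).
# Assumption: deaths among exposed subjects (x = 1) are linked less often.
LINK_PROB = {(0, False): 0.95, (0, True): 0.95,
             (1, False): 0.95, (1, True): 0.60}


def simulate_cohort(n, beta, rng, base_rate=0.1, censor_time=10.0):
    """Return (time, event, x) records with hazard base_rate * exp(beta * x)."""
    cohort = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        t = -math.log(u) / (base_rate * math.exp(beta * x))
        event = t <= censor_time  # administrative censoring
        cohort.append((min(t, censor_time), event, x))
    return cohort


def fit_cox(records, max_iter=25, tol=1e-8):
    """Newton-Raphson estimate of beta for a single covariate (no tied events)."""
    # Sort by decreasing time so the running sums always describe the risk set.
    ordered = sorted(records, key=lambda r: r[0], reverse=True)
    b = 0.0
    for _ in range(max_iter):
        s0 = s1 = s2 = score = info = 0.0
        for _time, event, x in ordered:
            w = math.exp(b * x)
            s0, s1, s2 = s0 + w, s1 + x * w, s2 + x * x * w
            if event:
                mean = s1 / s0
                score += x - mean                # partial-likelihood score
                info += s2 / s0 - mean * mean    # observed information
        if info <= 0.0:
            break
        step = score / info
        b += step
        if abs(step) < tol:
            break
    return b


def run_experiment(replicates=20, n=1000, beta=0.7, seed=1):
    """Average the full-cohort and linked-subset estimates over replicates."""
    rng = random.Random(seed)
    full_sum = linked_sum = 0.0
    for _ in range(replicates):
        cohort = simulate_cohort(n, beta, rng)
        # Keep each record with its assumed linkage probability.
        linked = [r for r in cohort if rng.random() < LINK_PROB[(r[2], r[1])]]
        full_sum += fit_cox(cohort)
        linked_sum += fit_cox(linked)
    mean_full = full_sum / replicates
    mean_linked = linked_sum / replicates
    return mean_full, mean_linked, (mean_linked - beta) / beta


if __name__ == "__main__":
    full, linked, rel_bias = run_experiment()
    print(f"full: {full:.3f}  linked: {linked:.3f}  relative bias: {rel_bias:.1%}")
```

With these assumed probabilities, the linked-subset estimate should fall below the full-cohort estimate, since deaths among exposed subjects are under-represented after linkage. The size of the gap depends entirely on the assumed linkage mechanism and is not meant to reproduce the figures reported in the abstract.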
Pages: 92-96
Number of pages: 5