Evaluating latent class models with conditional dependence in record linkage

Cited by: 11
Authors
Daggy, Joanne [1]
Xu, Huiping [1]
Hui, Siu [1, 2]
Grannis, Shaun [2, 3]
Affiliations
[1] Indiana Univ Sch Med, Dept Biostat, Indianapolis, IN 46202 USA
[2] Regenstrief Inst Hlth Care, Indianapolis, IN 46202 USA
[3] Indiana Univ Sch Med, Dept Family Med, Indianapolis, IN 46202 USA
Funding
Agency for Healthcare Research and Quality (USA)
Keywords
latent class; record linkage; loglinear model; random effects; DIAGNOSTIC-TEST PERFORMANCE; GOLD STANDARD; TESTS; ACCURACY;
DOI
10.1002/sim.6230
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Record linkage methods commonly use a traditional latent class model to classify record pairs from different sources as true matches or non-matches. This approach was first formally described by Fellegi and Sunter and assumes that the agreement in fields is independent conditional on the latent class. Consequences of violating the conditional independence assumption include bias in parameter estimates from the model. We sought to further characterize the impact of conditional dependence on the overall misclassification rate, sensitivity, and positive predictive value in the record linkage problem when the conditional independence assumption is violated. Additionally, we evaluate various methods to account for the conditional dependence. These methods include loglinear models with appropriate interaction terms identified through the correlation residual plot as well as Gaussian random effects models. The proposed models are used to link newborn screening data obtained from a health information exchange. On the basis of simulations, loglinear models with interaction terms demonstrated the best misclassification rate, although this type of model cannot accommodate other data features such as continuous measures for agreement. Results indicate that Gaussian random effects models, which can handle additional data features, perform better than assuming conditional independence and in some situations perform as well as the loglinear model with interaction terms. Copyright (c) 2014 John Wiley & Sons, Ltd.
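The traditional latent class model described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it fits a two-class model under the conditional independence assumption with a plain EM algorithm, using simulated binary agreement vectors. All names (`simulate_pairs`, `fit_latent_class`, `linkage_metrics`) and all numeric settings (five fields, a 20% match prevalence, agreement probabilities of 0.9 and 0.1, a 0.5 posterior cutoff) are hypothetical choices made for illustration. It does not cover the loglinear interaction or Gaussian random effects models that the paper evaluates.

```python
import random


def simulate_pairs(n_pairs, p_match, m_probs, u_probs, seed=0):
    """Simulate binary field-agreement vectors under conditional independence."""
    rng = random.Random(seed)
    patterns, labels = [], []
    for _ in range(n_pairs):
        is_match = rng.random() < p_match
        probs = m_probs if is_match else u_probs
        patterns.append([1 if rng.random() < q else 0 for q in probs])
        labels.append(is_match)
    return patterns, labels


def clip(x, eps=1e-6):
    """Keep a probability strictly inside (0, 1) to avoid division by zero."""
    return min(max(x, eps), 1.0 - eps)


def fit_latent_class(patterns, n_iter=200):
    """EM for a two-class latent class model with independent fields.

    Returns the estimated match prevalence, the per-field agreement
    probabilities among matches (m) and non-matches (u), and the
    posterior match probability of every record pair.
    """
    n = len(patterns)
    n_fields = len(patterns[0])
    p = 0.1                      # assumed starting values
    m = [0.8] * n_fields
    u = [0.2] * n_fields
    post = [0.0] * n
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a true match.
        for i, g in enumerate(patterns):
            like_m, like_u = p, 1.0 - p
            for k, agree in enumerate(g):
                like_m *= m[k] if agree else 1.0 - m[k]
                like_u *= u[k] if agree else 1.0 - u[k]
            post[i] = like_m / (like_m + like_u)
        # M-step: update prevalence and per-field agreement probabilities.
        total = sum(post)
        p = clip(total / n)
        for k in range(n_fields):
            agree_m = sum(w * g[k] for w, g in zip(post, patterns))
            agree_u = sum((1.0 - w) * g[k] for w, g in zip(post, patterns))
            m[k] = clip(agree_m / total)
            u[k] = clip(agree_u / (n - total))
    return p, m, u, post


def linkage_metrics(post, labels, threshold=0.5):
    """Misclassification rate, sensitivity, and PPV against known truth."""
    tp = fp = fn = 0
    for w, truth in zip(post, labels):
        predicted = w >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    misclassification = (fp + fn) / len(labels)
    return misclassification, sensitivity, ppv


# Hypothetical simulation settings, chosen only for illustration.
patterns, labels = simulate_pairs(
    n_pairs=2000, p_match=0.2, m_probs=[0.9] * 5, u_probs=[0.1] * 5, seed=1
)
p_hat, m_hat, u_hat, post = fit_latent_class(patterns)
misclassification, sensitivity, ppv = linkage_metrics(post, labels)
print("estimated match prevalence:", round(p_hat, 3))
print("misclassification:", round(misclassification, 4))
print("sensitivity:", round(sensitivity, 4), "PPV:", round(ppv, 4))
```

Because the simulated fields really are conditionally independent, this sketch shows the model in its best case. Generating correlated agreement within a class (for example, first name and sex agreeing together) would be one way to observe the bias in estimates and error rates that the abstract describes.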
Pages: 4250-4265
Page count: 16