Accuracy of probabilistic and deterministic record linkage: the case of tuberculosis

Cited by: 20
Authors
de Oliveira, Gisele Pinto [1]
de Souza Bierrenbach, Ana Luiza [2]
de Camargo Junior, Kenneth Rochel [3]
Coeli, Claudia Medina [4]
Pinheiro, Rejane Sobrino [4]
Affiliations
[1] Univ Fed Rio de Janeiro, Inst Estudos Saude Colet, Programa Posgrad Saude Colet, Rio De Janeiro, RJ, Brazil
[2] Hosp Sirio Libanes, Inst Ensino & Pesquisa, Sao Paulo, SP, Brazil
[3] Univ Estado Rio de Janeiro, Inst Med Social, Rio De Janeiro, RJ, Brazil
[4] Univ Fed Rio de Janeiro, Inst Estudos Saude Colet, Rio De Janeiro, RJ, Brazil
Source
REVISTA DE SAUDE PUBLICA | 2016 / Vol. 50
Keywords
Tuberculosis; epidemiology; Data Accuracy; Sensitivity and Specificity; Epidemiological Surveillance; statistics & numerical data;
DOI
10.1590/S1518-8787.2016050006327
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
OBJECTIVE: To analyze the accuracy of deterministic and probabilistic record linkage in identifying duplicate TB records, and to describe the characteristics of discordant pairs.

METHODS: The study analyzed all TB records from 2009 to 2011 in the state of Rio de Janeiro. A deterministic record linkage algorithm was developed using a set of 70 rules, each combining three or more fragments of the key variables, with or without modification (Soundex or substring). The probabilistic approach required a score cutoff point above which links would be automatically classified as belonging to the same individual. The cutoff point was obtained by linking the Notifiable Diseases Information System - Tuberculosis database with itself, followed by manual review and analysis of ROC and precision-recall curves. Sensitivity and specificity were calculated to assess accuracy.

RESULTS: Sensitivity ranged from 87.2% to 95.2% and specificity from 99.8% to 99.9% for probabilistic and deterministic record linkage, respectively. Missing values for the key variables and low similarity scores for name and date of birth were the main reasons the techniques failed to identify records of the same individual.

CONCLUSIONS: The two techniques showed a high level of correlation for pair classification. Although deterministic linkage identified more duplicate records than probabilistic linkage, the latter retrieved records not identified by the former. User need and experience should be considered when choosing the technique to be used.
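
To make the deterministic approach concrete, the following is a minimal Python sketch of a single rule built from three fragments (Soundex of the first name, Soundex of the last name, and a substring of the date of birth), together with the sensitivity and specificity calculation. The abstract does not list the 70 rules or the key variables, so the field names (`first_name`, `last_name`, `birth_date`), the rule itself, and the sample records are all hypothetical and serve only as illustration.

```python
def soundex(name):
    """Standard American Soundex code (first letter plus three digits)."""
    groups = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
              ("l", "4"), ("mn", "5"), ("r", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    return (result + "000")[:4]


def rule_match(a, b):
    """Hypothetical rule: three fragments must all be present and equal."""
    fragments = (
        lambda r: soundex(r.get("first_name") or ""),
        lambda r: soundex(r.get("last_name") or ""),
        lambda r: (r.get("birth_date") or "")[:4],  # substring: birth year
    )
    for fragment in fragments:
        fa, fb = fragment(a), fragment(b)
        # A missing value yields an empty fragment and can never match.
        if not fa or fa != fb:
            return False
    return True


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Made-up records that differ only in the spelling of the surname.
rec_a = {"first_name": "Maria", "last_name": "Souza", "birth_date": "1980-05-17"}
rec_b = {"first_name": "Maria", "last_name": "Sousa", "birth_date": "1980-05-17"}
rec_c = {"first_name": "Maria", "last_name": "Souza", "birth_date": None}

print(rule_match(rec_a, rec_b))
print(rule_match(rec_a, rec_c))
```

In this sketch a record with a missing date of birth cannot satisfy the rule, which mirrors the abstract's observation that missing key variables were a main cause of unidentified duplicates. A probabilistic approach would instead assign each pair a score and compare it with the chosen cutoff.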
Pages: 9