Accuracy of probabilistic and deterministic record linkage: the case of tuberculosis

Cited by: 20
Authors
de Oliveira, Gisele Pinto [1 ]
de Souza Bierrenbach, Ana Luiza [2 ]
de Camargo Junior, Kenneth Rochel [3 ]
Coeli, Claudia Medina [4 ]
Pinheiro, Rejane Sobrino [4 ]
Affiliations
[1] Univ Fed Rio de Janeiro, Inst Estudos Saude Colet, Programa Posgrad Saude Colet, Rio De Janeiro, RJ, Brazil
[2] Hosp Sirio Libanes, Inst Ensino & Pesquisa, Sao Paulo, SP, Brazil
[3] Univ Estado Rio de Janeiro, Inst Med Social, Rio De Janeiro, RJ, Brazil
[4] Univ Fed Rio de Janeiro, Inst Estudos Saude Colet, Rio De Janeiro, RJ, Brazil
Source
REVISTA DE SAUDE PUBLICA | 2016 / Vol. 50
Keywords
Tuberculosis; epidemiology; Data Accuracy; Sensitivity and Specificity; Epidemiological Surveillance; statistics & numerical data;
DOI
10.1590/S1518-8787.2016050006327
Chinese Library Classification
R1 [Preventive Medicine, Hygiene];
Subject classification code
1004 ; 120402 ;
Abstract
OBJECTIVE: To analyze the accuracy of deterministic and probabilistic record linkage in identifying duplicate tuberculosis (TB) records, and to describe the characteristics of discordant pairs.
METHODS: The study analyzed all TB records from 2009 to 2011 in the state of Rio de Janeiro. A deterministic record linkage algorithm was developed using a set of 70 rules, each combining three or more fragments of the key variables, used with or without modification (Soundex or substring). The probabilistic approach required a cutoff point for the score, above which links would be automatically classified as belonging to the same individual. The cutoff point was obtained by linking the Notifiable Diseases Information System - Tuberculosis database with itself, followed by manual review and analysis of ROC and precision-recall curves. Sensitivity and specificity were calculated to assess accuracy.
RESULTS: Sensitivity ranged from 87.2% to 95.2% and specificity from 99.8% to 99.9% for probabilistic and deterministic record linkage, respectively. Missing values in the key variables and low similarity percentages for name and date of birth were the main reasons the techniques failed to identify records of the same individual.
CONCLUSIONS: The two techniques showed a high level of correlation for pair classification. Although deterministic linkage identified more duplicate records than probabilistic linkage, the latter retrieved records not identified by the former. User need and experience should be considered when choosing the technique to be used.
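The methods summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not list the 70 rules, the key variables, or the cutoff value, so the single rule below (Soundex of the first and last name tokens plus a birth-year substring), the field names `name` and `birth_date`, the ISO date format, and the scores and cutoff in the usage example are all assumptions chosen only for illustration. The sketch shows one deterministic rule built from three fragments, classification by a score cutoff, and sensitivity and specificity computed against a manual review.

```python
# Hypothetical illustration; rule, field names and data are NOT from the paper.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """American Soundex code (one letter plus three digits) of a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # vowels reset the previous code to ""
    return "".join(out)[:4].ljust(4, "0")


def rule_match(a, b):
    """One assumed rule with three fragments: Soundex of the first name token,
    Soundex of the last name token, and the birth year (substring)."""
    def fragments(rec):
        tokens = (rec.get("name") or "").split()
        birth = rec.get("birth_date") or ""  # assumed format: YYYY-MM-DD
        if not tokens or len(birth) < 4:
            return None  # a missing key variable prevents the rule from firing
        return (soundex(tokens[0]), soundex(tokens[-1]), birth[:4])

    fa, fb = fragments(a), fragments(b)
    return fa is not None and fa == fb


def classify_by_cutoff(scores, cutoff):
    """Probabilistic step: pairs scoring above the cutoff are taken as links."""
    return [s > cutoff for s in scores]


def sensitivity_specificity(predicted, truth):
    """Compare predicted links with a manual review (truth).
    Assumes truth contains at least one true pair and one non-pair."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    return tp / (tp + fn), tn / (tn + fp)
```

For example, `rule_match({"name": "Maria Souza", "birth_date": "1980-05-02"}, {"name": "Mariah Sousa", "birth_date": "1980-02-05"})` links the two records despite the spelling variants, while a record with no birth date never matches under this rule, which mirrors the abstract's remark that missing key variables were a main cause of missed duplicates.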
Pages: 9
Related papers
50 items in total
  • [41] Risk of fractures in patients with multiple sclerosis: record-linkage study
    Ramagopalan, Sreeram V.
    Seminog, Olena
    Goldacre, Raphael
    Goldacre, Michael J.
    BMC NEUROLOGY, 2012, 12
  • [42] Association Between Cholecystectomy and Intestinal Cancer A National Record Linkage Study
    Goldacre, Michael J.
    Wotton, Clare J.
    Abisgold, Julie
    Yeates, David G. R.
    Collins, John
    ANNALS OF SURGERY, 2012, 256 (06) : 1068 - 1072
  • [43] Is chronic pain associated with subsequent cancer? A cohort record linkage study
    Elliott, Alison M.
    Torrance, Nicola
    Smith, Blair H.
    Lee, Amanda J.
    EUROPEAN JOURNAL OF PAIN, 2010, 14 (08) : 860 - 863
  • [44] Maternal and perinatal risk factors for childhood cancer: record linkage study
    Bhattacharya, Sohinee
    Beasley, Marcus
    Pang, Dong
    Macfarlane, Gary J.
    BMJ OPEN, 2014, 4 (01):
  • [45] A comparison of accuracy and computational feasibility of two record linkage algorithms in retrieving vital status information from HIV/AIDS patients registered in Brazilian public databases
    de Paula, Adelzon Assis
    Pires, Denise Franqueira
    Alves Filho, Pedro
    Valente de Lemos, Katia Regina
    Barcante, Eduardo
    Pacheco, Antonio Guilherme
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2018, 114 : 45 - 51
  • [46] Diagnostic Accuracy of a Rapid Urine Lipoarabinomannan Test for Tuberculosis in HIV-Infected Adults
    Nakiyingi, Lydia
    Moodley, Vineshree Mischka
    Manabe, Yukari C.
    Nicol, Mark P.
    Holshouser, Molly
    Armstrong, Derek T.
    Zemanay, Widaad
    Sikhondze, Welile
    Mbabazi, Olive
    Nonyane, Bareng A. S.
    Shah, Maunank
    Joloba, Moses L.
    Alland, David
    Ellner, Jerrold J.
    Dorman, Susan E.
    JAIDS-JOURNAL OF ACQUIRED IMMUNE DEFICIENCY SYNDROMES, 2014, 66 (03) : 270 - 279
  • [47] Multifocal tuberculosis in children: A case of spinal tuberculosis
    Aouraghe, Hanae
    Benchekroun, Soumia
    Mahraoui, Chafiq
    ElHafidi, Naima
    INTERNATIONAL JOURNAL OF MYCOBACTERIOLOGY, 2023, 12 (02) : 204 - 206
  • [48] Diagnostic accuracy of the NOVA Tuberculosis Total Antibody Rapid test for detection of pulmonary tuberculosis and infection with Mycobacterium tuberculosis
    Nsubuga, Gideon
    Kennedy, Samuel
    Rani, Yasha
    Hafiz, Zibran
    Kim, Soyeon
    Ruhwald, Morten
    Alland, David
    Ellner, Jerrold
    Joloba, Moses
    Dorman, Susan E.
    Penn-Nicholson, Adam
    Nakiyingi, Lydia
    JOURNAL OF CLINICAL TUBERCULOSIS AND OTHER MYCOBACTERIAL DISEASES, 2023, 31
  • [49] Linking mothers and infants within electronic health records: a comparison of deterministic and probabilistic algorithms
    Baldwin, Eric
    Johnson, Karin
    Berthoud, Heidi
    Dublin, Sascha
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2015, 24 (01) : 45 - 51
  • [50] Tuberculosis-Associated Septic Shock: A Case Series
    Arya, Veerendra
    Shukla, Amarendra K.
    Prakash, Brahma
    Bhargava, Jitendra K.
    Gupta, Akriti
    Patel, Brij B.
    Tiwari, Pawan
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2022, 14 (03)