Fast and optimal algorithm for case-control matching using registry data: application on the antibiotics use of colorectal cancer patients

Cited by: 22
Authors
Mamouris, Pavlos [1]
Nassiri, Vahid [2]
Molenberghs, Geert [3, 4]
van den Akker, Marjan [1, 5, 6]
van der Meer, Joep [1]
Vaes, Bert [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Kapucijnenvoer 33,J Bldg, B-3000 Leuven, Belgium
[2] Open Analyt NV, Antwerp, Belgium
[3] Univ Leuven, KU Leuven, I BioStat, Leuven, Belgium
[4] Hasselt Univ, I BioStat, Diepenbeek, Belgium
[5] Maastricht Univ, Dept Family Med, Care & Publ Hlth Res Inst, Maastricht, Netherlands
[6] Goethe Univ, Inst Gen Practice, Frankfurt, Germany
Keywords
Case-control; Optimal matching; Comorbidity index; Colorectal cancer; GENERAL-PRACTICE; RISK; COHORT; BIAS
DOI
10.1186/s12874-021-01256-3
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Abstract
Background: In case-control studies, most matching algorithms allow controls to be sampled several times, which is not always optimal. If many controls are available and adjustment for several covariates is necessary, matching without replacement might increase statistical efficiency. Comparing similar units is of utmost importance in observational data, where confounding and selection bias are present. The aim was twofold: first, to create a method that accommodates the option of not resampling a control, and second, to display several scenarios that show how odds ratios (ORs) change as the balance of the matched sample increases.

Methods: The algorithm was developed iteratively, from the pre-processing steps that derive the data through to its application in a study of the association between antibiotic use and colorectal cancer risk in the INTEGO registry (Flanders, Belgium). Different scenarios were developed to investigate the fluctuation of ORs, combining exact and varying matching variables with or without replacement of controls. To achieve balance in the population, we introduced the Comorbidity Index (CI) variable, defined as the sum of a patient's chronic diseases, as a means of obtaining comparable units from which to draw valid associations.

Results: The algorithm is fast and optimal. It is fast because, on simulated data, the run-time of matching was minimal even with millions of patients. It is optimal because the closest control is always captured (through appropriate ordering and the creation of auxiliary variables) and because, when a case has only one eligible control, that control is guaranteed to be matched to that case, which maximizes the number of cases used in the analysis. In total, 72 scenarios were displayed, showing the fluctuation of ORs and revealing patterns, notably a drop when the population was balanced.

Conclusions: We created an optimal and computationally efficient algorithm to derive a matched case-control sample with and without replacement of controls. The code and the functions are publicly available as open source in an R package. Finally, we emphasize the importance of displaying several scenarios and assessing the difference in ORs when an index is used to balance the population in observational data.
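The abstract does not spell out the matching procedure, so the following Python sketch is only an illustration of the ideas it names, not the authors' R implementation. It assumes a greedy scheme: controls are grouped by the exact-matching variables, candidates must fall within a caliper on one varying variable, cases with the fewest eligible controls are handled first, and each control is used at most once. The function names, the field names (`id`, `sex`, `ci`, `age`), and the caliper value are all hypothetical.

```python
from collections import defaultdict


def comorbidity_index(chronic_flags):
    """Comorbidity Index as described in the abstract: the sum of 0/1 chronic-disease indicators."""
    return sum(chronic_flags)


def match_without_replacement(cases, controls, exact_keys, vary_key, caliper, n_controls=1):
    """Greedy nearest-control matching in which each control is used at most once.

    Assumption (not taken from the paper): cases are ordered by how many
    eligible controls they have, so a case with a single candidate is
    served before cases that have alternatives.
    """
    # Group controls by their values on the exact-matching variables.
    pool = defaultdict(list)
    for ctrl in controls:
        pool[tuple(ctrl[k] for k in exact_keys)].append(ctrl)

    def eligible(case, used):
        stratum = pool.get(tuple(case[k] for k in exact_keys), [])
        return [c for c in stratum
                if c["id"] not in used
                and abs(c[vary_key] - case[vary_key]) <= caliper]

    used = set()
    matches = {}
    # Auxiliary ordering: fewest candidates first (counted before any matching).
    ordered = sorted(cases, key=lambda case: len(eligible(case, used)))
    for case in ordered:
        # Closest control first; ties are broken by id so the result is deterministic.
        candidates = sorted(eligible(case, used),
                            key=lambda c: (abs(c[vary_key] - case[vary_key]), c["id"]))
        chosen = candidates[:n_controls]
        if chosen:
            matches[case["id"]] = [c["id"] for c in chosen]
            used.update(c["id"] for c in chosen)
    return matches


# Toy data: case "A" has one eligible control (c1); case "B" has two (c1, c2).
cases = [
    {"id": "B", "sex": "F", "ci": comorbidity_index([1, 0, 0]), "age": 52},
    {"id": "A", "sex": "F", "ci": comorbidity_index([0, 1, 0]), "age": 50},
]
controls = [
    {"id": "c1", "sex": "F", "ci": 1, "age": 51},
    {"id": "c2", "sex": "F", "ci": 1, "age": 56},
]

matches = match_without_replacement(cases, controls,
                                    exact_keys=("sex", "ci"),
                                    vary_key="age", caliper=5)
print(matches)
```

In this toy data, serving case "B" first would let it take "c1", its closest control, and leave case "A" unmatched. Ordering by candidate count keeps both cases in the sample. A greedy pass like this is a heuristic and does not by itself guarantee a globally optimal assignment.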
Pages: 9