Data cleaning and harmonization of clinical trial data: Medication-assisted treatment for opioid use disorder

Times cited: 0
Authors
Balise, Raymond R. [1]
Hu, Mei-Chen [2]
Calderon, Anna R. [1]
Odom, Gabriel J. [3]
Brandt, Laura [4]
Luo, Sean X. [2]
Feaster, Daniel J. [1]
Affiliations
[1] Univ Miami, Miller Sch Med, Dept Publ Hlth Sci, Miami, FL 33136 USA
[2] Columbia Univ, Vagelos Coll Phys & Surg, Dept Psychiat, New York, NY USA
[3] Florida Int Univ, Stempel Coll Publ Hlth, Dept Biostat, Miami, FL USA
[4] Coll City New York, Dept Psychol, New York, NY USA
Source
PLOS ONE | 2024 / Vol. 19 / Issue 11
Keywords
BUPRENORPHINE-NALOXONE; RELIABILITY; METHADONE; HEALTH; VALIDITY; OUTCOMES; VERSION; DRUG
DOI
10.1371/journal.pone.0312695
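Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract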
Several large-scale, pragmatic clinical trials on opioid use disorder (OUD) have been completed in the National Drug Abuse Treatment Clinical Trials Network (CTN). However, the resulting data have not been harmonized between the studies to compare the patient characteristics. This paper provides lessons learned from a large-scale harmonization process that are critical for all biomedical researchers collecting new data and those tasked with combining datasets. We harmonized data from multiple domains from CTN-0027 (N = 1269), which compared methadone and buprenorphine at federally licensed methadone treatment programs; CTN-0030 (N = 653), which recruited patients who used predominantly prescription opioids and were treated with buprenorphine; and CTN-0051 (N = 570), which compared buprenorphine and extended-release naltrexone (XR-NTX) and recruited from inpatient treatment facilities. Patient-level data were harmonized and a total of 23 database tables, with meticulous documentation, covering more than 110 variables, along with three tables with "meta-data" about the study design and treatment arms, were created. Domains included: social and demographic characteristics, medical and psychiatric history, self-reported drug use details and urine drug screening results, withdrawal, and treatment drug details. Here, we summarize the numerous issues with the organization and fidelity of the publicly available data which were noted and resolved, and present results on patient characteristics across the three trials and the harmonized domains, respectively. A systematic harmonization of OUD clinical trial data can be accomplished, despite heterogeneous data coding and classification procedures, by standardizing commonly assessed characteristics. Similar methods, embracing database normalization and/or "tidy" data, should be used for future datasets in other substance use disorder clinical trials.
Pages: 37