Detecting irregularities in randomized controlled trials using machine learning

Cited by: 0

Authors:

Nelson, Walter [1]

Petch, Jeremy [1,2,3]

Ranisau, Jonathan

Zhao, Robin

Balasubramanian, Kumar

Bangdiwala, Shrikant I. [2]

Affiliations:

[1] Hamilton Hlth Sci, Ctr Data Sci & Digital Hlth, Hamilton, ON, Canada

[2] Populat Hlth Res Inst, 20 Copeland Ave, Hamilton L8L 2X2, ON, Canada

[3] McMaster Univ, Dept Med, Hamilton, ON, Canada

Source:

CLINICAL TRIALS | 2024

Keywords:

Central statistical monitoring; machine learning; artificial intelligence; outlier detection; quality assurance; data quality; randomized controlled trials; DABIGATRAN;

DOI:

10.1177/17407745241297947

Chinese Library Classification number:

R-3 [Medical research methods]; R3 [Basic medicine];

Discipline classification code:

1001 ;

Abstract:

Background: Over the course of a clinical trial, irregularities may arise in the data. Trialists implement human-intensive, expensive central statistical monitoring procedures to identify and correct these irregularities before the results of the trial are analyzed and disseminated. Machine learning algorithms have shown promise for identifying center-level irregularities in multi-center clinical trials with minimal human intervention. We aimed to characterize the form-level data irregularities in several historical clinical trials and evaluate the ability of a machine learning-based outlier detection algorithm to identify them.

Methods: Data irregularities previously identified by humans in historical clinical trials were ascertained by comparing preliminary snapshots of the trial databases to the final, locked databases. We measured the ability of a machine learning-based outlier detection algorithm to identify form-level irregularities using concordance (area under the receiver operator characteristic), positive predictive value (precision), and sensitivity (recall).

Results: We examined preliminary snapshots of seven historical clinical trials which randomized a total of 77,001 participants. We extracted a total of 1,267,484 completed entries from 358 case report forms containing irregularities from all snapshots across all trials, containing a total of 24,850 form-wide irregularities (median per-form form-level irregularity rate: 1.81%). Our proposed machine learning algorithm detects form-level irregularities with a median concordance of 0.74 (interquartile range = 0.57-0.89), slightly exceeding the performance of a previously proposed machine learning approach with a median area under the receiver operator characteristic of 0.73 (interquartile range = 0.54-0.88).

Conclusion: Data irregularities in historical clinical trials were ascertained by comparing preliminary snapshots of the trial database to the final database. These irregularities can be categorized according to their scope. Irregularities can be successfully detected by a machine learning algorithm as early or earlier than a human can, without human intervention. Such an approach may complement existing techniques for central statistical monitoring in large multi-center randomized controlled trials and possibly improve the efficiency of costly data verification processes.
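
The evaluation described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary-based form representation, the toy values, and the score threshold are all assumptions made for illustration. The sketch labels a form entry as irregular when its preliminary snapshot differs from the final locked entry, then computes concordance (area under the ROC curve), precision, and recall for a set of outlier scores.

```python
from typing import Dict, Hashable, Sequence, Tuple


def label_irregular_forms(snapshot: Dict[Hashable, dict],
                          final: Dict[Hashable, dict]) -> Dict[Hashable, int]:
    """Label an entry 1 if it changed between snapshot and final, else 0.

    Assumption: entries absent from the final database are skipped.
    """
    return {key: int(entry != final[key])
            for key, entry in snapshot.items() if key in final}


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Concordance: probability that a random irregular entry scores
    higher than a random regular entry (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one entry of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when entries scoring >= threshold are flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (hypothetical): four entries of one case report form.
snapshot = {"f1": {"sbp": 120}, "f2": {"sbp": 1200},
            "f3": {"sbp": 118}, "f4": {"sbp": 95}}
final = {"f1": {"sbp": 120}, "f2": {"sbp": 120},
         "f3": {"sbp": 118}, "f4": {"sbp": 115}}
outlier_scores = {"f1": 0.1, "f2": 0.9, "f3": 0.4, "f4": 0.3}

truth = label_irregular_forms(snapshot, final)
keys = sorted(truth)
y = [truth[k] for k in keys]
s = [outlier_scores[k] for k in keys]

concordance = auroc(y, s)
precision, recall = precision_recall(y, s, threshold=0.35)
print(concordance, precision, recall)
```

The pairwise concordance calculation is quadratic in the number of entries and is chosen here only for readability; at the scale reported in the abstract a rank-based computation would be more practical.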

Pages: 178-187

Number of pages: 10
