A test for treatment effects in randomized controlled trials, harnessing the power of ultrahigh dimensional big data

Cited by: 4

Authors:

Lee, Wen-Chung [1]

Lin, Jui-Hsiang [1]

Affiliations:

[1] Natl Taiwan Univ, Coll Publ Hlth, Inst Epidemiol & Prevent Med, Taipei, Taiwan

Source:

MEDICINE | 2019 / Vol. 98 / Issue 43

Keywords:

big data; biostatistics; data mining; potential-outcome model; randomized controlled trial; sample size; sharp null; GEOMETRIC REPRESENTATION; SELECTION;

DOI:

10.1097/MD.0000000000017630

Chinese Library Classification (CLC) number:

R5 [Internal Medicine];

Discipline classification code:

1002 ; 100201 ;

Abstract:

Background: The randomized controlled trial (RCT) is the gold-standard research design in biomedicine. However, practical concerns often limit the sample size, n, the number of patients in an RCT. We aim to show that the power of an RCT can be increased by increasing p, the number of baseline covariates (sex, age, and socio-demographic, genomic, and clinical profiles, etc., of the patients) collected in the RCT (referred to as the 'dimension'). Methods: The conventional test for treatment effects is based on testing the 'crude null' that the outcomes of the subjects do not differ between the two arms of an RCT. We propose a 'high-dimensional test', which is based on testing the 'sharp null' that the experimental intervention has no treatment effect whatsoever for patients of any covariate profile. Results: Using computer simulations, we show that the high-dimensional test can become very powerful in detecting treatment effects for very large p, but not for small or moderate p. Using a real dataset, we demonstrate that the P value of the high-dimensional test decreases as the number of baseline covariates increases, though it is still not significant. Conclusion: In this big-data era, pushing the p of an RCT to the millions, billions, or even trillions may someday become feasible. The high-dimensional test proposed in this study could then become very powerful in detecting treatment effects.


Pages: 7

19 references in total

[1] The high-dimension, low-sample-size geometric representation holds under mild conditions [J].

Ahn, Jeongyoun; Marron, J. S.; Muller, Keith M.; Chi, Yueh-Yun.

BIOMETRIKA, 2007, 94 (03) :760-766

[2] From big data analysis to personalized medicine for all: challenges and opportunities [J].

Alyass, Akram; Turcotte, Michelle; Meyre, David.

BMC MEDICAL GENOMICS, 2015, 8

[3] [Anonymous], 2008, Modern epidemiology

[4] Bounds on causal effects in randomized trials with noncompliance under monotonicity assumptions about covariates [J].

Chiba, Yasutaka.

STATISTICS IN MEDICINE, 2009, 28 (26) :3249-3259

[5] KERNEL DIMENSION REDUCTION IN REGRESSION [J].

Fukumizu, Kenji; Bach, Francis R.; Jordan, Michael I.

ANNALS OF STATISTICS, 2009, 37 (04) :1871-1905

[6] Geometric representation of high dimension, low sample size data [J].

Hall, P; Marron, JS; Neeman, A.

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2005, 67 :427-444

[7] Matrix variate logistic regression model with application to EEG data [J].

Hung, Hung; Wang, Chen-Chien.

BIOSTATISTICS, 2013, 14 (01) :189-202

[8] Clinical research methodology I: Introduction to randomized trials [J].

Kao, Lillian S.; Tyson, Jon E.; Blakely, Martin L.; Lally, Kevin P.

JOURNAL OF THE AMERICAN COLLEGE OF SURGEONS, 2008, 206 (02) :361-369

[9] Big Data And New Knowledge In Medicine: The Thinking, Training, And Tools Needed For A Learning Health System [J].

Krumholz, Harlan M.

HEALTH AFFAIRS, 2014, 33 (07) :1163-1170

[10] INTRODUCTION TO SAMPLE-SIZE DETERMINATION AND POWER ANALYSIS FOR CLINICAL-TRIALS [J].

LACHIN, JM.

CONTROLLED CLINICAL TRIALS, 1981, 2 (02) :93-113
