Inverse probability weighting is an effective method to address selection bias during the analysis of high dimensional data

Cited by: 9
Authors
Carry, Patrick M. [1, 2]
Vanderlinden, Lauren A. [1, 3]
Dong, Fran [4]
Buckner, Teresa [1]
Litkowski, Elizabeth [1]
Vigers, Timothy [3]
Norris, Jill M. [1]
Kechris, Katerina [3]
Affiliations
[1] Colorado Sch Publ Hlth, Dept Epidemiol, Aurora, CO 80045 USA
[2] Univ Colorado, Dept Orthoped, Musculoskeletal Res Ctr, Anschutz Med Campus, Aurora, CO USA
[3] Colorado Sch Publ Hlth, Dept Biostat & Informat, Aurora, CO USA
[4] Univ Colorado, Sch Med, Barbara Davis Ctr Diabet, Anschutz Med Campus, Aurora, CO USA
Funding
National Institutes of Health (USA);
Keywords
DAISY; DNA methylation; inverse probability weighting; selection bias; DIABETES AUTOIMMUNITY; PACKAGE;
DOI
10.1002/gepi.22418
Chinese Library Classification number
Q3 [Genetics];
Discipline classification codes
071007; 090102;
Abstract
Omics studies frequently use samples collected during cohort studies. Conditioning on sample availability can cause selection bias if sample availability is nonrandom. Inverse probability weighting (IPW) is purported to reduce this bias. We evaluated IPW in an epigenome-wide analysis testing the association between DNA methylation (261,435 probes) and age in healthy adolescent subjects (n = 114). We simulated age and sex to be correlated with sample selection and then evaluated three conditions: complete population/no selection bias (all subjects), naive selection bias (selection bias with no adjustment), and IPW selection bias (selection bias with IPW adjustment). Assuming the complete population condition represented the "truth," we compared the naive and IPW conditions to it. Bias, defined as the difference in the associations between age and methylation, was reduced in the IPW condition versus the naive condition. However, genomic inflation and type 1 error were higher in the IPW condition relative to the naive condition. After adjustment using bacon, type 1 error and inflation were similar across all conditions. Power was higher under the IPW condition compared with the naive condition both before and after inflation adjustment. IPW methods can reduce bias in genome-wide analyses. Genomic inflation is a potential concern that can be minimized using methods that adjust for inflation.
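The comparison of conditions described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the simulated outcome, the selection probabilities, and the helper names `weighted_slope`, `simulate`, and `ipw_weights` are all assumptions made for illustration. A single simulated outcome stands in for one methylation probe. The sketch assumes selection depends on sex and an age group, and estimates selection probabilities as the observed selection fraction within each stratum.

```python
import numpy as np

def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x (with intercept)."""
    x_bar = np.average(x, weights=w)
    y_bar = np.average(y, weights=w)
    return np.sum(w * (x - x_bar) * (y - y_bar)) / np.sum(w * (x - x_bar) ** 2)

def simulate(n=200_000, seed=0):
    """Hypothetical cohort: age, sex, one outcome, and nonrandom selection."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(10, 18, n)
    sex = rng.integers(0, 2, n)
    # Assumed outcome model: the age slope differs by sex, so a sample
    # that over-represents one sex distorts the marginal age association.
    y = (0.5 + 1.0 * sex) * age + rng.normal(0, 1, n)
    older = (age > 14).astype(int)
    # Assumed selection model: probability depends on sex and age group.
    p_select = 0.1 + 0.5 * sex + 0.3 * older
    selected = rng.random(n) < p_select
    return age, sex, older, y, selected

def ipw_weights(sex, older, selected):
    """Inverse of the estimated selection probability in each sex x age-group stratum."""
    stratum = 2 * sex + older
    p_hat = np.empty(len(sex))
    for s in range(4):
        mask = stratum == s
        p_hat[mask] = selected[mask].mean()
    return 1.0 / p_hat

age, sex, older, y, selected = simulate()
w = ipw_weights(sex, older, selected)

# Complete population: every subject, equal weights.
full = weighted_slope(age, y, np.ones(len(age)))
# Naive: selected subjects only, no adjustment.
naive = weighted_slope(age[selected], y[selected], np.ones(selected.sum()))
# IPW: selected subjects only, weighted by inverse selection probability.
ipw = weighted_slope(age[selected], y[selected], w[selected])

print(f"complete population slope: {full:.3f}")
print(f"naive slope:               {naive:.3f}")
print(f"IPW slope:                 {ipw:.3f}")
```

Under these assumptions the IPW slope should land closer to the complete-population slope than the naive slope does. The sketch covers only the bias comparison; it does not address the genomic inflation or type 1 error findings, which arise when many probes are tested at once.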
Pages: 593-603
Page count: 11
Related papers
20 items in total
  • [11] Inverse-probability weighting and multiple imputation for evaluating selection bias in the estimation of childhood obesity prevalence using data from electronic health records
    Carmen Sayon-Orea
    Conchi Moreno-Iribas
    Josu Delfrade
    Manuela Sanchez-Echenique
    Pilar Amiano
    Eva Ardanaz
    Javier Gorricho
    Garbiñe Basterra
    Marian Nuin
    Marcela Guevara
    BMC Medical Informatics and Decision Making, 20
  • [12] Analysis of Nested Case-Control Study Designs: Revisiting the Inverse Probability Weighting Method
    Kim, Ryung S.
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2013, 20 (06) : 455 - 466
  • [13] Inverse probability weighting for selection bias in a Delaware community health center electronic medical record study of community deprivation and hepatitis C prevalence
    Goldstein, Neal D.
    Kahal, Deborah
    Testa, Karla
    Burstyn, Igor
    ANNALS OF EPIDEMIOLOGY, 2021, 60 : 1 - 7
  • [14] Selection Bias Tracking and Detailed Subset Comparison for High-Dimensional Data
    Borland, David
    Wang, Wenyuan
    Zhang, Jonathan
    Shrestha, Joshua
    Gotz, David
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2020, 26 (01) : 429 - 439
  • [15] Adaptive Contextualization Methods for Combating Selection Bias during High-Dimensional Visualization
    Gotz, David
    Sun, Shun
    Cao, Nan
    Kundu, Rita
    Meyer, Anne-Marie
    ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2017, 7 (04)
  • [16] Effect Estimation in Point-Exposure Studies with Binary Outcomes and High-Dimensional Covariate Data-A Comparison of Targeted Maximum Likelihood Estimation and Inverse Probability of Treatment Weighting
    Pang, Menglan
    Schuster, Tibor
    Filion, Kristian B.
    Schnitzer, Mireille E.
    Eberg, Maria
    Platt, Robert W.
    INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2016, 12 (02)
  • [17] ROBUST INFERENCE WHEN COMBINING INVERSE-PROBABILITY WEIGHTING AND MULTIPLE IMPUTATION TO ADDRESS MISSING DATA WITH APPLICATION TO AN ELECTRONIC HEALTH RECORDS-BASED STUDY OF BARIATRIC SURGERY
    Thaweethai, Tanayott
    Arterburn, David E.
    Coleman, Karen J.
    Haneuse, Sebastien
    ANNALS OF APPLIED STATISTICS, 2021, 15 (01) : 126 - 147
  • [18] Inverse weighting method with jackknife variance estimator for differential expression analysis of single-cell RNA sequencing data
    Zhou, Lingjie
    Pan, Qing
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2022, 100
  • [19] Dynamic Hierarchical Aggregation, Selection Bias Tracking, and Detailed Subset Comparison for High-Dimensional Event Sequence Data
    Zhang, Jonathan
    Borland, David
    Wang, Wenyuan
    Shrestha, Joshua
    Gotz, David
    2019 IEEE WORKSHOP ON VISUAL ANALYTICS IN HEALTHCARE (VAHC), 2019, : 56 - 57
  • [20] New variable selection strategy for analysis of high-dimensional DNA methylation data
    Choi, Jiyun
    Kim, Kipoong
    Sun, Hokeun
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2018, 16 (04)