Non-random sampling leads to biased estimates of transcriptome association

Authors
A. S. Foulkes
R. Balasubramanian
J. Qian
M. P. Reilly
Affiliations
[1] Massachusetts General Hospital
[2] Harvard Medical School
[3] Department of Medicine
[4] Biostatistics
[5] University of Massachusetts
[6] Department of Biostatistics and Epidemiology
[7] Columbia University
[8] Cardiology Division
[9] Department of Medicine and the Irving Institute for Clinical and Translational Sciences
Source
Scientific Reports, Volume 10
Abstract
Integration of independent data resources across -omics platforms offers transformative opportunity for novel clinical and biological discoveries. However, application of emerging analytic methods in the context of selection bias represents a noteworthy and pervasive challenge. We hypothesize that combining differentially selected samples for integrated transcriptome analysis will lead to bias in the estimated association between predicted expression and the trait. Our results are based on in silico investigations and a case example focused on body mass index across four well-described cohorts apparently derived from markedly different populations. Our findings suggest that integrative analysis can lead to substantial relative bias in the estimate of association between predicted expression and the trait. The average estimate of association ranged from 51.3% less than to 96.7% greater than the true value for the biased sampling scenarios considered, while the average error was −2.7% for the unbiased scenario. The corresponding 95% confidence interval coverage rate ranged from 46.4% to 69.5% under biased sampling, and was equal to 75% for the unbiased scenario. Inverse probability weighting with observed and estimated weights is applied as one corrective measure and appears to reduce the bias and improve coverage. These results highlight a critical need to address selection bias in integrative analysis and to use caution in interpreting findings in the presence of different sampling mechanisms between groups.
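The inverse probability weighting idea mentioned in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the authors' simulation design: the effect size `beta`, the selection probabilities (0.9 and 0.1), and the rule that selection depends on the sign of the trait are all assumptions chosen for illustration, and the function names `weighted_slope` and `simulate` are hypothetical. It uses the known ("observed") selection probabilities as weights; estimating those probabilities is not shown.

```python
import random


def weighted_slope(x, y, w):
    """Weighted least-squares slope of y on x, with an intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def simulate(n=20000, beta=0.5, seed=1):
    """Draw a population, select non-randomly on the trait, fit two slopes."""
    rng = random.Random(seed)
    xs, ys, ps = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)             # stand-in for predicted expression
        y = beta * x + rng.gauss(0, 1)  # stand-in for the trait
        # Assumed selection mechanism: high-trait subjects are oversampled.
        p = 0.9 if y > 0 else 0.1
        if rng.random() < p:
            xs.append(x)
            ys.append(y)
            ps.append(p)
    # Naive analysis ignores how the sample was selected.
    naive = weighted_slope(xs, ys, [1.0] * len(xs))
    # IPW analysis weights each subject by 1 / P(selected).
    ipw = weighted_slope(xs, ys, [1.0 / p for p in ps])
    return naive, ipw


naive, ipw = simulate()
print(f"true slope 0.5, naive {naive:.3f}, IPW {ipw:.3f}")
```

Under these assumptions the unweighted slope is attenuated relative to the true value, while the weighted slope lands much closer to it, at the cost of a larger variance because a few subjects carry large weights.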
Citation
Foulkes, A. S., Balasubramanian, R., Qian, J., Reilly, M. P. Non-random sampling leads to biased estimates of transcriptome association. Scientific Reports, 2020, 10(1).