Comparison of different cell type correction methods for genome-scale epigenetics studies

Cited by: 55
Authors
Kaushal, Akhilesh [1]
Zhang, Hongmei [1]
Karmaus, Wilfried J. J. [1]
Ray, Meredith [1]
Torres, Mylin A. [2,3]
Smith, Alicia K. [2,4]
Wang, Shu-Li [5]
Affiliations
[1] Univ Memphis, Div Epidemiol Biostat & Environm Hlth, Memphis, TN 38152 USA
[2] Emory Univ, Winship Canc Inst, 1365 Clifton Rd NE, Atlanta, GA 30322 USA
[3] Emory Univ, Dept Radiat Oncol, Sch Med, 1365 Clifton Rd NE, Atlanta, GA 30322 USA
[4] Emory Univ, Dept Psychiat & Behav Sci, Sch Med, 101 Woodruff Circle, Suite 4000, Atlanta, GA 30322 USA
[5] Natl Hlth Res Inst, Natl Inst Environm Hlth Sci, Miaoli, Taiwan
Source
BMC BIOINFORMATICS | 2017 / Vol. 18
Keywords
Cell-type composition; CpG sites; Genome-scale DNA methylation; Surrogate variables; DNA METHYLATION; ARSENIC EXPOSURE; GENE-EXPRESSION; HETEROGENEITY; BIOCONDUCTOR; GENDER; BREAST; BLOOD; AGE;
DOI
10.1186/s12859-017-1611-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Whole blood is frequently utilized in genome-wide association studies of DNA methylation patterns in relation to environmental exposures or clinical outcomes. These associations can be confounded by cellular heterogeneity. Algorithms have been developed to measure or adjust for this heterogeneity, and some have been compared in the literature. However, with new methods available, it is unknown whether the findings will be consistent and, if not, which method(s) perform better. Methods: We compared eight cell-type correction methods including the method in the minfi R package, the method by Houseman et al., the Removing unwanted variation (RUV) approach, the methods in FaST-LMM-EWASher, ReFACTor, RefFreeEWAS, and RefFreeCellMix R programs, along with one approach utilizing surrogate variables (SVAs). We first evaluated the association of DNA methylation at each CpG across the whole genome with prenatal arsenic exposure levels and with cancer status, adjusted for estimated cell-type information obtained from different methods. We then compared CpGs showing statistical significance from different approaches. For the methods implemented in minfi and proposed by Houseman et al., we utilized homogeneous data with composition of some blood cells available and compared them with the estimated cell compositions. Finally, for methods not explicitly estimating cell compositions, we evaluated their performance using simulated DNA methylation data with a set of latent variables representing "cell types". Results: Results from the SVA-based method overall showed the highest agreement with all other methods except for FaST-LMM-EWASher. Using homogeneous data, minfi provided better estimates of cell types than the method originally proposed by Houseman et al. Further simulation studies on methods free of reference data revealed that SVA provided good sensitivities and specificities, RefFreeCellMix in general produced high sensitivities but its specificities tended to be low when confounding was present, and FaST-LMM-EWASher gave the lowest sensitivity but the highest specificity. Conclusions: Results from real data and simulations indicated that SVA is recommended when the focus is on the identification of informative CpGs. When appropriate reference data are available, the method implemented in the minfi package is recommended. However, if no such reference data are available or if the focus is not on estimating cell proportions, the SVA method is suggested.
Pages: 12