Comparison of different cell type correction methods for genome-scale epigenetics studies

Cited by: 55
Authors
Kaushal, Akhilesh [1]
Zhang, Hongmei [1]
Karmaus, Wilfried J. J. [1]
Ray, Meredith [1]
Torres, Mylin A. [2, 3]
Smith, Alicia K. [2, 4]
Wang, Shu-Li [5]
Affiliations
[1] Univ Memphis, Div Epidemiol Biostat & Environm Hlth, Memphis, TN 38152 USA
[2] Emory Univ, Winship Canc Inst, 1365 Clifton Rd NE, Atlanta, GA 30322 USA
[3] Emory Univ, Dept Radiat Oncol, Sch Med, 1365 Clifton Rd NE, Atlanta, GA 30322 USA
[4] Emory Univ, Dept Psychiat & Behav Sci, Sch Med, 101 Woodruff Circle,Suite 4000, Atlanta, GA 30322 USA
[5] Natl Hlth Res Inst, Natl Inst Environm Hlth Sci, Miaoli, Taiwan
Source
BMC BIOINFORMATICS | 2017 / Volume 18
Keywords
Cell-type composition; CpG sites; Genome-scale DNA methylation; Surrogate variables; DNA METHYLATION; ARSENIC EXPOSURE; GENE-EXPRESSION; HETEROGENEITY; BIOCONDUCTOR; GENDER; BREAST; BLOOD; AGE;
DOI
10.1186/s12859-017-1611-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Whole blood is frequently used in genome-wide association studies of DNA methylation patterns in relation to environmental exposures or clinical outcomes. These associations can be confounded by cellular heterogeneity. Algorithms have been developed to measure or adjust for this heterogeneity, and some have been compared in the literature. However, with new methods available, it is unknown whether the findings will be consistent and, if not, which method(s) perform better.
Methods: We compared eight cell-type correction methods: the method in the minfi R package, the method by Houseman et al., the removing unwanted variation (RUV) approach, the methods in the FaST-LMM-EWASher, ReFACTor, RefFreeEWAS, and RefFreeCellMix R programs, and one approach based on surrogate variable analysis (SVA). We first evaluated the association of DNA methylation at each CpG across the whole genome with prenatal arsenic exposure levels and with cancer status, adjusted for the cell-type information estimated by each method. We then compared the CpGs showing statistical significance under the different approaches. For the methods implemented in minfi and proposed by Houseman et al., we used homogeneous data for which the composition of some blood cells was available and compared the measured compositions with the estimated ones. Finally, for methods that do not explicitly estimate cell compositions, we evaluated performance using simulated DNA methylation data with a set of latent variables representing "cell types".
Results: Results from the SVA-based method overall showed the highest agreement with all other methods except FaST-LMM-EWASher. Using homogeneous data, minfi provided better estimates of cell-type composition than the originally proposed method by Houseman et al. Further simulation studies on reference-free methods revealed that SVA provided good sensitivities and specificities, that RefFreeCellMix in general produced high sensitivities but tended to have low specificities when confounding was present, and that FaST-LMM-EWASher gave the lowest sensitivity but the highest specificity.
Conclusions: Results from real data and simulations indicated that SVA is recommended when the focus is on identifying informative CpGs. When appropriate reference data are available, the method implemented in the minfi package is recommended. However, if no such reference data are available, or if the focus is not on estimating cell proportions, the SVA method is suggested.
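The sensitivity and specificity reported for the simulation studies can be illustrated with a minimal sketch. This is not the authors' code: it only assumes a simulation in which the truly associated CpGs are known, and the function name and the toy data below are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], called: Sequence[bool]
) -> Tuple[float, float]:
    """Compare CpGs flagged by a method against the simulated truth.

    truth[i]  -- True if CpG i was simulated as associated with the exposure
    called[i] -- True if the correction method declared CpG i significant
    (Both arguments are hypothetical inputs used for illustration only.)
    """
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")

    tp = sum(t and c for t, c in zip(truth, called))          # true positives
    fn = sum(t and not c for t, c in zip(truth, called))      # false negatives
    tn = sum(not t and not c for t, c in zip(truth, called))  # true negatives
    fp = sum(not t and c for t, c in zip(truth, called))      # false positives

    # Sensitivity: share of truly associated CpGs that were detected.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of null CpGs that were correctly left undetected.
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy example: 10 CpGs, 4 truly associated; the method misses one of them
# and reports one false positive.
truth = [True, True, True, True, False, False, False, False, False, False]
called = [True, True, True, False, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(truth, called)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under this reading, a method that flags very few CpGs, as described for FaST-LMM-EWASher, would score high specificity and low sensitivity, while a method that flags many CpGs under confounding would show the opposite pattern.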
Pages: 12