Characterization and machine learning prediction of allele-specific DNA methylation

Cited by: 7
Authors
He, Jianlin [1]
Sun, Ming-an [2]
Wang, Zhong [3, 4]
Wang, Qianfei [1]
Li, Qing [3, 4]
Xie, Hehuang [1, 2, 5]
Affiliations
[1] Chinese Acad Sci, Beijing Inst Genom, Lab Genome Variat & Precis Biomed, Beijing 100101, Peoples R China
[2] Virginia Tech, Epigenom & Computat Biol Lab, Virginia Bioinformat Inst, Blacksburg, VA 24060 USA
[3] Sun Yat Sen Univ, Sch Pharmaceut Sci, Guangzhou 510080, Guangdong, Peoples R China
[4] Sun Yat Sen Univ, Ctr Cellular & Struct Biol, Guangzhou 510080, Guangdong, Peoples R China
[5] Virginia Tech, Dept Biol Sci, Blacksburg, VA 24060 USA
Funding
National Science Foundation (USA);
Keywords
Allele-specific DNA methylation; SNP; Epigenetic variation; Logistic regression classifier; GENE; ENHANCERS; SEQUENCE; CPGS;
DOI
10.1016/j.ygeno.2015.09.007
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005; 0836; 090102; 100705;
Abstract
A large collection of single nucleotide polymorphisms (SNPs) has been identified in the human genome. Currently, the epigenetic influences of SNPs on their neighboring CpG sites remain elusive. A growing body of evidence suggests that locus-specific information, including genomic features and local epigenetic state, may play important roles in the epigenetic readout of SNPs. In this study, we made use of mouse methylomes with known SNPs to develop statistical models for the prediction of SNP-associated allele-specific DNA methylation (ASM). ASM has been classified into parent-of-origin dependent ASM (P-ASM) and sequence-dependent ASM (S-ASM), which comprises scattered-S-ASM (sS-ASM) and clustered-S-ASM (cS-ASM). We found that P-ASM and cS-ASM CpG sites are both enriched in CpG-rich regions, promoters and exons, while sS-ASM CpG sites are enriched in simple repeats and in regions with a high frequency of SNPs. Using Lasso-grouped Logistic Regression (LGLR), we selected 21 out of 282 genomic and methylation-related features that are powerful in distinguishing cS-ASM CpG sites and trained the classifiers with machine learning techniques. Based on 5-fold cross-validation, the logistic regression classifier was found to be the best for cS-ASM prediction, with an accuracy (ACC) of 0.77, an area under the ROC curve (AUC) of 0.84 and a Matthews correlation coefficient (MCC) of 0.54. Lastly, we applied the logistic regression classifier to a human brain methylome and predicted 608 genes associated with cS-ASM. Gene ontology term enrichment analysis indicated that these cS-ASM-associated genes are significantly enriched in the category coding for transcripts with alternative splicing forms. In summary, this study provides an analytical procedure for cS-ASM prediction and sheds new light on the different types of ASM events. Published by Elsevier Inc.
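For readers unfamiliar with the three evaluation metrics named in the abstract (ACC, AUC and MCC), the following is a minimal sketch of how each can be computed for a binary classifier. It is not the authors' code: the function names and the toy labels and scores are hypothetical and chosen only for illustration, and no data from the paper is used.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """ACC: fraction of predictions that match the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = cS-ASM site, 0 = non-ASM site.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # made-up classifier outputs
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # 0.5 cutoff is an assumption

print(accuracy(toy_labels, toy_preds))
print(mcc(toy_labels, toy_preds))
print(auc(toy_labels, toy_scores))
```

ACC and MCC depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason studies such as this one commonly report the three together.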
Pages: 331-339
Number of pages: 9