MethyQA: a pipeline for bisulfite-treated methylation sequencing quality assessment

Cited by: 14
Authors
Sun, Shuying [1, 2]
Noviski, Aaron [3]
Yu, Xiaoqing [1]
Affiliations
[1] Case Western Reserve Univ, Dept Epidemiol & Biostat, Cleveland, OH 44106 USA
[2] Texas State Univ, Dept Math, San Marcos, TX 78666 USA
[3] Case Western Reserve Univ, Dept Elect Engn & Comp Sci, Cleveland, OH 44106 USA
Source
BMC BIOINFORMATICS | 2013 / Vol. 14
Keywords
DNA methylation; Next generation sequencing; Alignment; BRAT; Quality assessment; DNA METHYLATION; BREAST-CANCER; CPG ISLANDS; HYPERMETHYLATION; PLURIPOTENT; EFFICIENT; ALIGNMENT; MARKERS; COLON; MAPS;
DOI
10.1186/1471-2105-14-259
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: DNA methylation is an epigenetic event that adds a methyl group to the 5-carbon position of cytosine. This epigenetic modification can significantly affect gene expression in both normal and diseased cells. Hence, it is important to study methylation signals at the single-cytosine level, which is now possible using the bisulfite conversion technique (i.e., converting unmethylated Cs to Us, which are then read as Ts after PCR amplification) and next-generation sequencing (NGS) technologies. Despite the advances in NGS technologies, certain quality issues remain. The more prevalent ones involve low per-base sequencing quality at the 3' end, PCR amplification bias, and bisulfite conversion rates. Therefore, it is important to conduct quality assessment before downstream analysis. To the best of our knowledge, no existing software package can generally assess the quality of methylation sequencing data generated with different bisulfite-treatment protocols.
Results: To conduct the quality assessment of bisulfite methylation sequencing data, we have developed a pipeline named MethyQA. MethyQA combines currently available open-source software packages with our own custom programs written in Perl and R. The pipeline can provide quality assessment results for tens of millions of reads in under an hour. The novelty of our pipeline lies in its examination of bisulfite conversion rates and of the DNA sequence structure of regions that have different conversion rates or coverage.
Conclusions: MethyQA is a new software package that gives users a unique insight into the methylation sequencing data they are studying. It allows users to determine the quality of their data and better prepares them to address the research questions that lie ahead. Because of the speed and efficiency with which MethyQA operates, it will become an important tool for studies dealing with bisulfite methylation sequencing data.
Pages: 9