Performance of Mapping Approaches for Whole-Genome Bisulfite Sequencing Data in Crop Plants

Times cited: 13
Authors
Grehl, Claudius [1, 2]
Wagner, Marc [3]
Lemnian, Ioana [1, 4]
Glaser, Bruno [2]
Grosse, Ivo [1, 5]
Affiliations
[1] Martin Luther Univ Halle Wittenberg, Inst Comp Sci, Bioinformat, von Seckendorff Pl 1, Halle, Saale, Germany
[2] Martin Luther Univ Halle Wittenberg, Inst Agron & Nutr Sci, Soil Biogeochem, von Seckendorff Pl 3, Halle, Saale, Germany
[3] Free Univ Berlin, Inst Math & Informat, Berlin, Germany
[4] Martin Luther Univ Halle Wittenberg, Inst Human Genet, Halle, Saale, Germany
[5] German Ctr Integrat Biodivers Res iDiv, Bioinformat Unit, Leipzig, Germany
Keywords
epigenetics; DNA methylation patterns; read mapping; benchmarking; WGBS; READ ALIGNMENT; WIDE; ACCURATE;
DOI
10.3389/fpls.2020.00176
Chinese Library Classification number
Q94 [Botany];
Subject classification code
071001;
Abstract
DNA methylation is involved in many different biological processes in the development and well-being of crop plants, such as transposon activation, heterosis, environment-dependent transcriptome plasticity, aging, and many diseases. Whole-genome bisulfite sequencing (WGBS) is an excellent technology for detecting and quantifying DNA methylation patterns in a wide variety of species, but optimized data analysis pipelines exist only for a small number of species and are missing for many important crop plants. This is especially important as most existing benchmark studies have been performed on mammals with hardly any repetitive elements and without CHG and CHH methylation. Pipelines for the analysis of whole-genome bisulfite sequencing data usually consist of four steps: read trimming, read mapping, quantification of methylation levels, and prediction of differentially methylated regions (DMRs). Here we focus on read mapping, which is challenging because unmethylated cytosines are transformed to uracil during bisulfite treatment and to thymine during the subsequent polymerase chain reaction, and read mappers must be capable of dealing with this cytosine/thymine polymorphism. Several read mappers have been developed in recent years, with different strengths and weaknesses, but their performances have not been critically evaluated. Here, we compare eight read mappers (Bismark, BismarkBwt2, BSMAP, BS-Seeker2, Bwameth, GEM3, Segemehl, and GSNAP) to assess the impact of the read-mapping results on the prediction of DMRs.
We used simulated data generated from the genomes of Arabidopsis thaliana, Brassica napus, Glycine max, Solanum tuberosum, and Zea mays, and monitored the effects of the bisulfite conversion rate, the sequencing error rate, the maximum number of allowed mismatches, and the genome structure and size. As features for benchmarking the eight read mappers, we calculated precision, number of uniquely mapped reads, distribution of the mapped reads, run time, and memory consumption. Furthermore, we validated our findings using real-world data of Glycine max and showed the influence of the mapping step on DMR calling in WGBS pipelines. We found that the conversion rate had only a minor impact on the mapping quality and the number of uniquely mapped reads, whereas the error rate and the maximum number of allowed mismatches had a strong impact and led to differences in the performance of the eight read mappers. In conclusion, we recommend BSMAP, which needs the shortest run time and yields the highest precision, and Bismark, which requires the smallest amount of memory and yields high precision and high numbers of uniquely mapped reads.
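The cytosine/thymine polymorphism and the precision measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation or evaluation code. The function names, the forward-strand-only conversion, and the definition of precision as correctly placed reads divided by all mapped reads are assumptions made here for illustration.

```python
import random


def bisulfite_convert(read, methylated_positions, conversion_rate, rng):
    """Return a read as it might look after bisulfite treatment and PCR.

    Assumption: only the forward strand is modelled. Each unmethylated
    cytosine becomes thymine with probability `conversion_rate`;
    methylated cytosines and all other bases stay unchanged.
    """
    converted = []
    for index, base in enumerate(read):
        is_unmethylated_c = base == "C" and index not in methylated_positions
        if is_unmethylated_c and rng.random() < conversion_rate:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def mapping_precision(true_positions, mapped_positions):
    """Fraction of mapped reads that were placed at their simulated origin.

    Assumption: both arguments are dicts of read id -> genomic position,
    and reads missing from `mapped_positions` count as unmapped.
    """
    if not mapped_positions:
        return 0.0
    correct = sum(
        1
        for read_id, position in mapped_positions.items()
        if true_positions.get(read_id) == position
    )
    return correct / len(mapped_positions)
```

With a conversion rate of 1.0, every unmethylated C in a simulated read turns into T, so a mapper has to tolerate C/T differences between read and reference without counting them as ordinary mismatches.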
Pages: 15