Fast and memory-efficient mapping of short bisulfite sequencing reads using a two-letter alphabet

Cited by: 11
Authors
Brandine, Guilherme de Sena [1]
Smith, Andrew D. [1]
Affiliations
[1] Univ Southern Calif, Quantitat & Computat Biol, 1050 Childs Way, Los Angeles, CA 90007 USA
Keywords
DNA METHYLATION
DOI
10.1093/nargab/lqab115
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
DNA cytosine methylation is an important epigenomic mark with a wide range of functions in many organisms. Whole genome bisulfite sequencing is the gold standard to interrogate cytosine methylation genome-wide. Algorithms used to map bisulfite-converted reads often encode the four-base DNA alphabet with three letters by reducing two bases to a common letter. This encoding substantially reduces the entropy of nucleotide frequencies in the resulting reference genome. Within the paradigm of read mapping by first filtering possible candidate alignments, reduced entropy in the sequence space can increase the required computing effort. We introduce another bisulfite mapping algorithm (abismal), based on the idea of encoding a four-letter DNA sequence as only two letters, one for purines and one for pyrimidines. We show that this encoding can lead to greater specificity compared to existing encodings used to map bisulfite sequencing reads. Through the two-letter encoding, the abismal software tool maps reads in less time and using less memory than most bisulfite sequencing read mapping software tools, while attaining similar accuracy. This allows in silico methylation analysis to be performed in a wider range of computing machines with limited hardware settings.
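The two-letter idea in the abstract can be illustrated with a minimal sketch. This is not the abismal implementation. The function names, the R/Y output letters (borrowed from the IUPAC codes for purine and pyrimidine), and the toy sequence are assumptions made for illustration only. The sketch shows that bisulfite conversion of unmethylated C to T leaves the purine/pyrimidine encoding of a sequence unchanged, so a converted read and the reference encode to the same string.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical,
# and this is not how abismal itself is implemented.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def two_letter_encode(seq: str) -> str:
    """Encode a DNA sequence as R (purine: A/G) or Y (pyrimidine: C/T)."""
    encoded = []
    for base in seq.upper():
        if base in PURINES:
            encoded.append("R")
        elif base in PYRIMIDINES:
            encoded.append("Y")
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return "".join(encoded)


def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Simulate bisulfite conversion: every C becomes T unless its
    0-based position is listed in `methylated`."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


reference = "ACGTTCGA"
# The C at position 1 is treated as methylated and stays C;
# the C at position 5 is unmethylated and is read as T.
read = bisulfite_convert(reference, methylated={1})

# C and T are both pyrimidines, so the encodings agree.
assert two_letter_encode(read) == two_letter_encode(reference)
```

The same invariance holds for reads from the complementary strand, where conversion appears as G to A, because G and A are both purines.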
Pages: 9